VISUAL DEFECTS AND READING


WALTER F. DEARBORN and HOLLIS M. LEVERETT
Psycho-Educational Clinic, Harvard University 


INTRODUCTION 


During the rapid growth of interest in the 
problems of reading and learning to read, 
many articles dealing with the visual mecha- 
nism in relation to reading have appeared. 
The studies reported and the impressions de- 
rived from these studies vary greatly and in 
many different respects. It seems desirable, 
therefore, to attempt a general evaluation of 
the information available on the relationship 
between vision and reading. 

Any effort to coordinate the material on 
reading and vision must face numerous difficulties.
The results of many investigations are
not conclusive. The data presented are not 
always consistent. The discussions in the liter- 
ature do not yield precise information. In fact, 
there appears to be an excessive conflict of 
opinion among students and specialists in the 
field. As a result the diversity of thought in 
matters concerning the relationship between 
visual defects and reading is now so great that 
it may have to be described as confusion. 
Clarification of the problems at issue and a 
better understanding of the situation as a 
whole are obvious desiderata. Although with 
the evidence at hand it may not be possible 
to reach a satisfactory level of understanding 
with regard to the relationships involved, the 
gaps in our present knowledge can at least 
be 


Many difficulties derive from the fact that 
each study is characterized by specific ele- 
ments which tend to reduce the comparability 
of that study to any other. These specifics 
lead to many differences of opinion and con- 
tribute greatly to the apparent confusion con- 
cerning visual defects and reading. It is quite 
important, therefore, that the major differ- 
ences among studies be noted carefully. They 
affect not only the basic data but also the
possible interpretations of these data. 

The populations treated vary from early 
grade school children to college and adult 
groups. Findings may be complicated by age,
education, and other factors. Information
based upon a study of one level of age or 
achievement does not necessarily hold for 
another. Additional problems arise when it is 
apparent that studies do not make an ade- 
quate random sampling of the population with 
which they are dealing. An investigation
directed at obtaining an answer to some specific
question on reading or vision may not
require a random sampling of a particular
population; specially selected groups of subjects
may supply the desired information. Consequently,
the data obtained in the study of special
groups cannot be interpreted as holding
true for any large portion of the population 
from which the special groups were drawn. 
In evaluating any study, it is always impor- 
tant to note (a) the nature of the individuals 
or groups investigated, (b) the extent to 
which a general population is sampled, and 
(c) the comparability of the cases treated to 
those of other studies. 

Experimental methodology in educational 
and other types of research permits wide 
choice. The problem at hand determines the 
method of study. There is no justification for 
selecting any particular method as especially 
desirable for use in investigating problems in 
vision and in reading. It is of greatest impor- 
tance, however, that differences in experi- 
mental methodology be considered in evalu- 
ating the literature and in comparing the re- 
sults of various studies. The method of study 
limits the interpretations which can be made 
from the data. Similarly, variations in methodology
make it difficult to compare the results
of different studies directly and decisively.

Another important aspect of the differences 
among studies involves the measurement of 
vision and of reading. Practically all investi- 
gations make use of unique measuring instru- 
ments or measuring techniques. It follows that 
constancy either in the functions measured or 
in the reliability of the results cannot be 
assumed. Furthermore, there are great differences
in the criteria adopted to distinguish
between good and poor vision, between good 
and poor reading. These matters may be 
crucial. 

Any attempt to review the literature on 
visual defects and reading is greatly compli- 
cated by the unique elements found in each 
report. It is not possible to consider each 
paper in terms of its likenesses and differences 
with respect to all other comparable studies; 
it would be impractical and confusing, if 
attempted. For this reason, it appears advis- 
able to call attention to the ever present prob- 
lems and then to proceed with an analysis of 
the major questions to be raised regarding the 
relationships between visual defects and 
reading. 

One consideration stands above all the 
material with which this paper is concerned. 
It involves an axiom with which the discus- 
sion must begin. This axiom is simply: There 
is a relationship between vision and reading, 
and hence between visual defects and reading. 
At no time is it necessary to ask whether or 
not there is a relationship. The relationship 
between reading and seeing is fundamental 
and essential. Reading is a visual act. The 
blind cannot read in the usual manner, and 
the more closely an individual’s vision
approaches that of blindness, the less likely 
it is that he can read. Similarly, the fact that 
good vision provides highly desirable equip- 
ment with which to read is unquestioned. 

There is thus no need to discuss the matter 
of whether or not visual conditions can affect
reading. They do. However, real problems 
exist in this field, and all of them are much 
more complex than the axiom just stated. The 
basic issues are expressed best in question 
form: 

First, what is the relative importance of 
visual defects among all other causes of read- 
ing deficiency? With what frequency do visual 
defects occur in a sufficient degree of severity 
to be considered responsible for difficulty in 
reading? 

Second, what degree of eye defect should 
be regarded as really detrimental to reading? 

Third, which eye defects are most impor- 
tant from the viewpoint of their effect upon 
reading? Which ones are comparatively neg- 
ligible in that they do not affect reading to 
any significant degree? 

The answers to these major questions are 
by no means clear. It seems desirable, nevertheless,
to consider them with whatever information
is available since they constitute the
fundamental issues. 


THE IMPORTANCE OF VISUAL DEFECTS


The frequency with which eye defects 
appear as significant factors in cases having 
difficulty with reading has as yet not been 
adequately determined. It goes without say- 
ing that precise knowledge of this frequency 
will alone provide a sound basis for estimat- 
ing the importance of eye defects among other 
causes of deficiencies in reading. In order to 
obtain this information, it is necessary to 
(a) examine an adequate sample of a general 
well-defined population, (b) segregate those 
having difficulty in reading, and (c) deter- 
mine the number of those segregated whose 
difficulty can be attributed to visual defi- 
ciencies. The project is not simple. On the 
other hand, the project should not be clas- 
sified as impossible. It can be done, but it has 
not been done. Consequently, there are no 
data which provide the desired information. 
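
As a minimal illustration of the three steps just listed, the sketch below works through the arithmetic on an invented sample. The records, the field names, and the function `visual_defect_share` are hypothetical and are not drawn from any study cited here; the decision that a given pupil's difficulty is attributable to vision is assumed to have been made beforehand by clinical judgment.

```python
from dataclasses import dataclass


@dataclass
class Pupil:
    poor_reader: bool          # step (b): has difficulty in reading
    vision_responsible: bool   # step (c): difficulty attributed to a visual deficiency


def visual_defect_share(sample):
    """Return (share of the sample who are poor readers,
    share of those poor readers whose difficulty is attributed to vision)."""
    if not sample:
        raise ValueError("sample must not be empty")
    poor = [p for p in sample if p.poor_reader]
    if not poor:
        return 0.0, 0.0
    attributed = [p for p in poor if p.vision_responsible]
    return len(poor) / len(sample), len(attributed) / len(poor)


# Invented sample of 200 pupils (step a): 40 poor readers, 6 attributed to vision.
sample = (
    [Pupil(True, True)] * 6
    + [Pupil(True, False)] * 34
    + [Pupil(False, False)] * 160
)
poor_rate, vision_share = visual_defect_share(sample)
print(f"{poor_rate:.2f} {vision_share:.2f}")
```

The second figure is the one the text asks for: the proportion of poor readers in a defined population whose difficulty can be laid to visual deficiencies. It is meaningful only if the sample in step (a) represents that population.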

In the absence of the information desired, 
it is necessary to attempt some evaluation of 
the problem using the available data. The in- 
vestigations made and the general impressions 
derived from clinical experience have resulted 
in rather wide differences of opinion on the 
importance of visual defects in reading. Given
the differences among studies
already noted, many discrepancies are understandable.
Some studies of selected reading
disability cases have found a high frequency 
of visual defects. Other studies report signifi- 
cant differences in the frequency or severity 
of eye defects observed in groups of good and 
poor readers. On the other hand, there are 
reports which question the importance of eye 
defects as factors leading to difficulty in read- 
ing. Sampling the opinions expressed in the 
literature, it is possible to produce a roughly 
graduated series of positive and negative 
impressions. 

Betts (7) has stated that “Approximately 
90 per cent of the non-readers and severely
retarded readers have been found to require 
medical attention before receiving pedagog- 
ical help”. Farris (26, 27) has reported many 
differences in the reading progress of children 
having visual defects indicating that vision is 
of utmost importance in learning to read. 
Blake and Dearborn (9) found the evidence 
in their study to be “. . . convincing that 
visual defects . . . are a contributing, if not
a primary cause, of difficulty in reading”. 

Eames (21) stated that “The tendency is 
for the reading disability group to exhibit 
poorer vision . . .” A study by Fendrick (28) 
led to the cautious statement that “Differ- 
ences favoring an indication of a relatively 
inferior performance on measures of visual 
acuity for the reading disability group were 
isolated”. 

Other studies have not revealed a relation- 
ship between reading and visual defects. 
Swanson and Tiffin (48) found the data of 
their study to “. . . show that Betts’ tests of 
visual sensation and perception do not differ- 
entiate significantly between good and poor 
readers at the college freshman level”. Strom- 
berg (47) likewise found no differences in 
visual functions between good and poor read- 
ers at the college level. Imus, Rothney, and 
Bear (33) reported that “Ocular defects are 
not found more frequently among (1) stu- 
dents having reading disability or (2) stu- 
dents making low academic grades than 
among the rest of the group”. Data collected 
by Witty and Kopel (51, 52) were reported 
as leading “. . . to the conclusion that poor 
readers are not characterized by a higher in- 
cidence of visual defects and anomalies than 
are good readers”. Rather extreme statements 
such as “. . . these tests showed clearly that 
eye defects have nothing to do with reading 
ability” have appeared, but the data would 
be more precisely interpreted by saying “that 
no differences could be found in the peripheral 
vision of poor and good readers”. 

There is thus little agreement on the actual 
importance of visual defects in reading. Im- 
pressions range from (a) considering visual 
defects to be the dominant factors in nearly 
all cases of reading difficulty to (b) consid- 
ering vision to be a relatively insignificant 
factor in reading problems. The true situation 
is not clear. 

Since the relationship between visual func- 
tions and reading can be accepted as axio- 
matic, the study of selected subjects reveals 
only how difficult or easy it is to demonstrate 
the relationship in a given group using par- 
ticular measures of reading and of vision. It 
is not surprising, therefore, that the results 
of many studies show no consistency. Broader 
survey studies on general, well-defined popu- 
lations are necessary before a sound estimate 
can be made concerning either the frequency 
with which eye defects interfere with reading
or the importance of eye defects among other 
factors leading to difficulty in reading. 

Survey studies have been conducted on 
large populations but the tendency has been 
to survey visual conditions as such. In study- 
ing vision, these surveys have either disre- 
garded the importance of vision to reading or 
made unsound assumptions concerning the 
importance of certain conditions to reading. 
Generally speaking, studies of visual condi- 
tions are based upon an arbitrary designation 
of what constitutes a visual defect; reading 
is not considered. Consequently, the measures 
of visual functions do not provide the means 
for estimating the importance of the visual 
defects from the viewpoint of reading problems.

When a defect in vision is defined as a 
deviation from an arbitrary standard of “normality”,
the survey can produce unreasonable
statistics. “Summarizing numerous surveys
and speaking generally of the extent 
of defective vision among school children, it 
may be said that 25 per cent would be an 
accurate representation. This would mean, of 
course, speaking in averages, since there are 
striking differences in the percentages reported 
by examiners in different localities.” (6) This 
report includes data from a study in which 
only 14 per cent of school children were considered
to have normal vision. The cases examined
were distributed as follows: Emmetropia,
13.9%; Hypermetropia, 36.2%; Compound
and Hypermetropic Astigmatism,
44.0%; Myopia, 1.4%; Compound Myopic
Astigmatism, 3.5%; Mixed Astigmatism,
1.0%. Kempf, Jarman, and Collins (35) report
more discouraging statistics; only 4 per
cent of the children examined were found to 
have “normal” vision. Using a cycloplegic and 
making a retinoscopic examination, it was 
found that “. . . 88 per cent of the children 
had some degree of hyperopia or hyperopic 
astigmatism; about 1 per cent had mixed 
astigmatism; 7 per cent had myopia or myopic 
astigmatism; and less than 4 per cent were 
emmetropic.” 
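
The arithmetic behind the distribution quoted from report (6) can be checked directly. The short sketch below is an illustration only, with variable names of our own choosing; it uses the percentages as printed in the text and confirms that they total 100 and that every category other than emmetropia, 86.1 per cent in all, falls outside the standard of "normal" vision.

```python
# Percentages as quoted in the text for the study included in report (6).
distribution = {
    "Emmetropia": 13.9,
    "Hypermetropia": 36.2,
    "Compound and Hypermetropic Astigmatism": 44.0,
    "Myopia": 1.4,
    "Compound Myopic Astigmatism": 3.5,
    "Mixed Astigmatism": 1.0,
}

# Round to one decimal place to absorb floating-point noise in the sum.
total = round(sum(distribution.values()), 1)
non_emmetropic = round(total - distribution["Emmetropia"], 1)
print(total, non_emmetropic)
```

A criterion under which roughly six children in seven are "defective" says more about the standard adopted than about the children's ability to read.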

Data based upon rigid and arbitrary cri- 
teria of visual defect can yield an exaggerated 
impression of the importance of eye defects. 
The material presented was selected to dem- 
onstrate this. It is possible also to minimize 
the apparent importance of eye defects by 
shifting the criteria used to distinguish between
“normal” and “defective” vision. For
the present purposes, however, the criterion of
defective vision is the defect which has a significant
detrimental influence on reading or
learning to read. Until data are gathered using 
this criterion, it will not be possible to esti- 
mate the frequency with which visual defects 
are actually detrimental to reading. The esti- 
mate that “. . . one-eighth of our entire school 
population are handicapped in their school 
work by eye defects . . .” (55) seems reason- 
able, but no factual data support the estimate. 
On the other hand, if it is estimated that 
“Almost half of the school population com- 
plains of eye discomfort in some form” (49), 
it does not follow that reading is disturbed 
for any known proportion of the school popu- 
lation. The extent to which reading is im- 
paired by eye defects in any given population 
remains to be determined. 


Studies of the vision of special groups of 
poor readers contribute little to the under- 
standing of the general importance of visual 
defects in reading. The cases are not usually 
drawn from any known and definable popula- 
tion; consequently, a comparable group of 
good readers cannot be obtained for study. 
The absence of data on the vision of normal 
readers (20, 24) reduces the interpretive 
value of a study just as the absence of data 
on reading functions limits the significance of 
visual surveys. The fact that a certain number 
of visual deficiencies was found in a group of 
poor readers is not subject to any generally 
useful interpretation. The importance of the 
visual defects as the possible source of the 
difficulties cannot be determined. 


It does not appear that either (a) the fre- 
quency with which visual defects may be con- 
sidered responsible for difficulty in reading or 
(b) the relative importance of eye defects 
among all causes of reading deficiency can be 
accurately determined with the information 
on hand. In the light of the numerous prob- 
lems involved, this is not surprising. The 
greatest difficulty, perhaps, derives from the 
absence of a criterion by which to detect the 
eye defects which are sufficiently severe to 
affect reading. If the exact degree of defect 
which interferes with reading were known, the 
data obtained in the surveys of eye conditions 
would yield the desired information. There is, 
however, a further complication in that a 
degree of defect sufficient to cause trouble for 
one person may be readily tolerated by another.
Thus, because of these personal equations
as to tolerance, the degree of defect
which interferes with reading will obviously
vary from person to person. The degree of
defect which on the average interferes with
reading is what needs to be determined.

JOURNAL OF EXPERIMENTAL EDUCATION [Vol. 13, No. 3]


THE SEVERITY OF EYE DEFECTS THAT
ARE DETRIMENTAL


Reading disability may be accounted for 
by many different factors. Among these are 
visual defects. When a case of reading disability
is found to have some eye defect, it
becomes necessary to determine the influence
of that defect as one of the possible causes
for the difficulty. If the eye defect is minor,
the likelihood of its being the cause of difficulty
is not great; other factors then require
careful consideration. On the other hand, if 
the visual defect is severe, it is more likely to 
be the cause of difficulty. The question of the 
severity of eye defects which are really detri- 
mental to reading arises immediately. To state 
that some specific measured degree of defect 
of any type is detrimental to reading would 
be quite arbitrary. The severity of any defect 
must be determined in terms of its effect upon 
the reading process. 

Not every deviation from “normal” can be 
designated as the cause of difficulty. The de- 
sirability of the best possible visual function 
is not to be underestimated, but several mat- 
ters must be considered before any eye defect 
is labelled as the cause of reading disability. 
The following questions must be weighed in
each case:


1. Is the measured defect so great that the 
eye is not capable of making the neces- 
sary compensatory adjustments to avoid 
difficulty in reading? 

2. Does the print size of the reading mate- 
rial really tax the defect present? 

3. Is the level of reading efficiency such 
that the reading habits make excessive 
demands upon the defective eyes? 


Positive answers to all three questions would, 
in the absence of other explanatory factors, 
create a strong case to show that the disabil- 
ity is due to the visual deficiency. The an- 
swers to these questions, however, are not 
easily found. Good, trained clinical judgment 
is required. 

Some eye defects are within the range of 
tolerance. Compensatory adjustments may
reduce the influence of a given defect to the
extent that it becomes entirely negligible. 
Individuals with slight muscular imbalances 
may be capable of maintaining the correct 
alignment of the eyes under ordinary reading 
conditions. Hyperopes do not necessarily 
suffer from unusual efforts in accommodation (34).
Not all deviations from normal vision
are handicaps to reading and
to learning to read. Under many circumstances,
an eye defect has no significant effect
upon reading. 

The form of the reading material also de- 
termines the importance of a visual defect. 
In the lower grades where reading is slow and 
the print is large, moderate defects are not 
likely to interfere with reading. On the other 
hand, similar moderate defects might result 
in serious disturbances if small print were in- 
volved. The nature of the printed reading 
material must be considered in determining 
whether or not this material would place real 
strain upon the visual defect present. 

The level of reading efficiency must be con- 
sidered. Slow methodical reading does not tax 
the eyes to any great extent. On the other 
hand, rapid and continuous reading subjects 
the eyes to great stress. The most excellent 
visual mechanism may be taxed by difficult 
and prolonged reading. If the practices of the 
individual require extensive reading, the 
slightest defect may be detrimental. Minor 
defects which are unimportant under ordinary 
circumstances may cause serious difficulty and 
discomfort if strenuous reading habits are 
involved. 

Other factors complicate the matter of de- 
tecting the visual defects which are really 
detrimental to reading. For example, as above 
noted, individual idiosyncrasies may change
any general concept regarding the degree of 
visual defect that may be tolerated without 
serious consequences. Some people show a re- 
markable tolerance for comparatively severe 
defects; others have difficulty with minor defi- 
ciencies. Different individuals with the same 
apparent degree of defect do not necessarily 
suffer the same degree of consequent discom- 
fort or difficulty in reading. Individual differ- 
ences do exist and they must be taken into 
account in the analysis of particular cases. 
The matter of fatigue and eyestrain has many 
aspects. People with eye defects may be ex- 
pected to have more difficulty with visual 
fatigue than those with normal vision. Problems
of eyestrain can and do occur in cases
having no defects as such. Dearborn (15) has 
called attention to the importance of fatigue 
as a factor in reading disability in the absence 
of any manifest visual defect. Numerous phy- 
sical conditions of light, print type and size, 
and the like contribute to eyestrain and 
fatigue. Extensive investigations along this 
line have been conducted by Luckiesh and
Moss; a number of these are published in 
book form (37). 

Lebensohn has summarized the problem of 
fatigue thus: “Under favorable circumstances
the eyes are not easily fatigued; but under 
unfavorable conditions in which there are 
difficulties with fixation, accommodation, 
adaptation, or interpretation eyestrain is ex- 
perienced (Lancaster). In an effort to improve 
perception the eye instrument continuously 
attempts slight changes in adjustment, the 
muscles become tense, hyperaemia and conse- 
quent hyperirritability result (Lebensohn) 
and fatigue follows. Over short periods of 
time, however, the eye adjusts itself surpris- 
ingly well to poor conditions.” (38) 

Considering the numerous and complex 
factors which determine the importance of a 
given eye defect in the reading process, no 
criteria can be established regarding the de- 
gree of any defect that may be considered to 
be the probable cause of reading disability. 
The individual’s tolerance or lack of tolerance 
for a given degree of defect cannot be pre- 
dicted. The printed form of the reading mate- 
rial and the reading practices of the individual 
must be evaluated. In every case of reading 
disability, factors other than those involving
the visual mechanism require consideration as 
possible causative elements. The problem is 
not simple. It remains a matter of individual 
clinical analysis and diagnosis. 


SPECIFIC TYPES OF EYE DEFECT


It may be expected that some eye defects 
would interfere with reading more than others. 
Unfortunately, different eye defects do not 
occur with the same frequency. The informa- 
tion available is determined somewhat by the 
frequency with which the defects occur. More 
data are available on the common eye condi- 
tions; less data, or none, on the rare condi- 
tions. The discussion is naturally limited by 
this matter. In addition to discussing the in- 
formation concerning specific types of defect, 
an effort will be made to point out certain 
conditions of the eyes which appear to require
further investigation. 

Hyperopia and myopia.—Reading is “close” 
work. It may be expected, therefore, that far- 
sighted individuals would tend to have more 
difficulty with reading than the nearsighted. 
A number of studies have found support for 
this consideration. Farris (26) reported that 
“Both hyperopia and strabismus are associ- 
ated with less than normal progress in read- 
ing; while myopia and myopic astigmatism 
were both found to be associated with more 
than normal progress.” Blake and Dear- 
born (9) found that “There is a higher pro- 
portion of farsightedness among those having 
difficulty in reading and a higher proportion 
of nearsighted astigmatism among good read- 
ers.” Taylor (49) reported that “Among the 
normals there are more cases tending toward 
myopia, while among the failures there are 
more tending toward hyperopia.” Johnston 
(34) states more cautiously “. . . it is very 
likely that a positive association exists be- 
tween latent hyperopia and low reading 
achievement for subjects similar to the ones 
in this [Johnston’s experimental] group.” 

The existence of this relationship is one of 
the most consistent findings in the literature. 
Some studies have not found the positive rela- 
tionship between visual defects and reading 
but, among those finding a relationship, none 
are known to have reported that hyperopia 
is associated with good reading, and myopia, 
with poor reading. The evidence may be 
affected by the relative frequency or severity 
of these defects in the groups studied; the 
findings are both consistent and reasonable, 
nevertheless. 

Muscle imbalances.—Strabismus was found 
to be associated with less than normal progress 
in reading by Farris. In addition, he stated 
that “Pupils whose visual perception is mo- 
nocular make progress in reading superior to 
those not having correct coordination of the 
eyes.” Selzer (43) tended to eliminate all 
other potential causes of difficulty in reading 
and to over-emphasize the matter of binocular 
vision in stating that “. . . conditions of 
muscle imbalance, and alternating vision, in 
addition to lack of fusion, the writer believes, 
account for such reading disabilities as are 
not accounted for by general mental disabil- 
ity.” Betts (7), also, placed great emphasis 
upon the importance of binocular coordination
and fusion: “If we were a one-eyed race,
our reading difficulties would probably be
few . . . Many of our reading problems are 
directly traceable to a lack of coordination 
between the two eyes and to the probable 
failure of the mind to combine the right and 
left eye pictures for correct interpretation.” 
“Approximately 90% of the severely disabled 
readers are visually characterized by faulty 
binocular coordination and astigmatism.” 

The estimates of Selzer and Betts concerning
the importance of eye muscle balance contrast
greatly with the findings of Fendrick and
Johnston, for example. Fendrick (28) found 
that “Measures of lateral muscle coordination 
did not yield any evidence that reading dis- 
ability cases manifested a more pronounced 
aberrance in muscle imbalance than the con- 
trol cases;” and Johnston (34) observed: 
“The obtained coefficient, therefore, could 
easily have been derived from a population 
wherein no association existed between lateral 
imbalance and reading achievement.” Other 
studies which found no relationship between 
eye defects and reading (33, 48, 51, 52) are 
also in disagreement with the extreme empha- 
sis sometimes placed upon muscular imbalance 
as a factor in reading disability. Eye muscle 
imbalance must be considered as a potential 
source of difficulty in reading but the need 
for more moderate estimates of its importance 
is apparent. 

The constant convergence required for
efficient reading with both eyes would be more 
difficult to maintain with moderate exophoria 
than with slight esophoria. For this reason, 
it is reasonable to consider exophoric tenden- 
cies to be more detrimental to reading than 
esophoric tendencies. On two occasions, Eames 
has reported a relationship between exophoric 
tendencies and difficulty in reading: “The 
principal fact brought out by this study is 
that, in reading disability, the eyes are more 
exophoric at the reading distance.” (21) An- 
other report considered that “The amplitude 
of fusion convergence may be expected to be 
low among non-readers . . .” and that “The 
amplitude of fusion convergence is an important
factor in reading disability.” (23)


Good (30) also found that a control group 
of good readers showed great superiority over 
an experimental group of poor readers in the 
measure of adduction power. Although Witty 
and Kopel did not consider their data to sup- 
port the existence of a relationship between 
visual functions and reading, it was noted that 
the poor readers demonstrated a higher proportion
of “slow fusion”.

It is interesting at this point to speculate
on the nature of esophoric tendencies. If, as 
Eames and Good have found, the amplitude 
of fusion convergence or adduction power is 
associated with reading ability, it is possible 
that this function is developed in reading. 
When poor readers show weakness in adduc- 
tion power, it may not be a basic physiological 
deficiency but a lack of proper development 
of eye muscles for reading. Since reading re- 
quires constant convergence, it may be a 
factor in the development of adduction power. 
If this were true, good readers would develop 
and so demonstrate a greater “amplitude of 
fusion convergence”, whereas those who read
little would not develop this eye muscle char- 
acteristic to the same degree. This possibility 
could be checked by a study of the adduction 
power of school children in the various grades. 
If improvement in adduction power were 
found to be associated with the development 
of reading, its significance as a cause of read- 
ing disability would have to be questioned. 
Adduction power may be the result of read- 
ing; its absence or weakness might be trace- 
able to the lack of reading activity. 

Although there has been some tendency to 
place excessive emphasis upon the importance 
of muscular imbalance as a cause of reading 
disability, this type of defect must be considered
as one which may seriously affect reading.
Certainly severe cases of constant, conscious
diplopia (8) can be quite disabling.
Exophoric tendencies and adduction weak- 
nesses appear to be associated with poor read- 
ing; further study is necessary to determine 
whether this is a causal relationship. 

The mechanism through which muscular 
imbalances may interfere with reading has 
two different aspects. First, muscle imbalance 
may affect the psycho-physiological functions 
involving the proper fusion of the images per- 
ceived. Second, eye muscle imbalance may 
affect the efficiency of eye movements. A dis- 
cussion of the first mechanism would be 
lengthy and involve some departure from the 
real content of this paper. It may be noted, 
however, that superficial and intangible dis- 
cussions of the subject are all too numerous. 
The second, however, concerns eye movements 
which appear to require more attention. 

Eye movements.—Before considering the
problem of defective eye movements, it is 
desirable to digress on two points. 

First, it is necessary to distinguish between 
faulty eye movements and eye movements in 
faulty reading. The fact that poor readers 
make a greater number of fixations and re- 
gressions than do good readers, and take more 
time in so doing, does not necessarily indicate 
that poor readers have faulty eye movements. 
The differences ordinarily found between the 
eye movements of good and poor readers are 
normal functions of the level of reading abil- 
ity. Tinker (50) has already called attention 
to the fact that inefficient reading is accom- 
panied by characteristic oculomotor behavior 
which is a function of the “perceptual and 
apprehensive process.” This does not preclude, 
however, the possibility that there are truly 
abnormal eye movements associated with de- 
fective eye conditions, as such, and not with 
poor reading, as such. 

The second point for digression concerns
the use of eye movement photography to 
measure reading ability. It has been advocated 
that “A binocular eye movement photograph 
or reading graph is indispensable in any com- 
prehensive visual or reading examination, as 
by this means only can the irregularities 
shown in the behavior of the eyes while they 
are actually at work be objectively deter- 
mined. The maturity of the reading habit, as 
indicated by the mechanical efficiency shown 
by the photographic record, is the only objec- 
tive information which permits comparison 
between groups or individuals and furnishes 
a definite prognostic test for checking corrective
measures.” (49) It is agreed that binocular
eye-movement photography is indispensable
in a comprehensive visual examination.
The importance of eye-movement photography
in a reading examination is, on the
other hand, seriously questioned.

Certain measures of eye movements cor- 
relate with reading ability. It does not follow 
that eye movements provide an adequate 
measure of reading ability. The reliability of 
eye-movement measures, as such, has not been 
found to be entirely satisfactory. (10, 33) 
Quite aside from the question of the reliability 
of photographic measures, however, such in- 
direct and incomplete measurement of reading 
is not recommended. Direct measures of reading
ability, standard achievement tests, formal
and informal diagnostic tests, certainly pro- 
vide more adequate, efficient, and suitable 
measures of reading ability. Eye movement 
measures cannot, under any circumstances, 
match the efficiency and thoroughness of 
reading tests. 

The true purpose of the analysis of eye 
movements by photographic means should not 
be the measurement of reading speed or the 
frequency of fixations and regressions. The 
real information to be obtained in eye-move- 
ment photography concerns the existence or 
non-existence of abnormal, truly “faulty” eye 
movements. Nystagmus, poor coordination, 
unusual convergent and divergent movements 
are not easily detected or adequately evalu- 
ated by other, more static means of observa- 
tion. The real usefulness and value of eye- 
movement photography rests in the investi- 
gation of these defects. 

Eye-movement photography provides a 
technique for studying muscle defects as they 
affect reading. Muscular defects such as gross 
imbalances and duction weaknesses can be 
detected and measured by comparatively 
static methods; but the functional importance 
of these defects, particularly the milder forms, 
upon the coordination of the eyes should be 
evaluated by the dynamic technique of eye- 
movement photography. Individuals having 
duction weaknesses or muscular imbalances 
may or may not show irregularities in eye 
movements. It is possible, also, that individ- 
uals who show no defects in the usual tests 
may demonstrate the presence of some abnor- 
malities when the eye movements are exam- 
ined. 

The abnormal eye movements to be studied 
by photographic means may be classified as 
follows: (1) Nystagmus. (2) Lack of coordi- 
nation of the eyes. (3) Abnormally large or 
irregular convergent and divergent movements 
during movement and during fixation. 

Chronic nystagmus is an exceedingly rare 
condition. Its effect upon reading is obvious. 
Since nystagmus is fundamentally a neuro- 
logical condition, it is not necessary to con- 
sider it among visual defects. The eye move- 
ments involved would provide excellent mate- 
rial for investigation by means of eye-move- 
ment photography. The eye movements in 
reading, variations in the nystagmic move- 
ments, if any, under varying conditions could 
be analyzed in detail. Photography also provides
excellent means for studying the nature
of artificially induced nystagmus. However,
the problems involved in nystagmus are, for
the most part, physiological.

Gross incoordination of the eyes may be 
detected by photography. One eye may lead 
the other when there are changes in the point 
of fixation. When fixating a particular point, 
one eye may detach itself from that point 
with the result that the visual functions are 
carried on by one eye only. These abnormal- 
ities may be expected in cases showing gross 
muscular imbalance. The possibility that they 
exist in cases with comparatively minor imbal- 
ances should be checked. 

The convergence of the eyes in saccadic 
movements and the divergence of the eyes 
during fixation appear to provide the most 
common and the most interesting subject for 
treatment. The significance of these move- 
ments is not entirely clear but they may be 
observed in nearly all subjects. It appears 
that Schmidt (42) made the first note con- 
cerning these movements in reading. Clark 
studied the relationship of these movements 
to exophoria (11, 14), to esophoria (13), and 
to inter-fixation distance (12). Stromberg in- 
vestigated the relationship to reading (46) 
and to lateral muscle imbalance (45). McFarland,
Knehr, and Berens (39, 40) obtained
data on the magnitude of divergence during 
fixation in relation to reading, visual defects, 
and anoxemia. 

Using a group of subjects with marked 
exophoric conditions and a matched group 
with normal phoric conditions, Clark found 
that the magnitude of the divergent move- 
ment in the first fixation of each line of read- 
ing material was greater for the exophoric 
subjects. No other differences were found; the 
usual eye movements, fixations, regressions, 
etc. did not differentiate between the groups 
studied. In another paper (13), this author
reported on a single case of esophoria. It was
found that this esophoric subject demon- 
strated divergent movements in fixations 
which were greater than those of the normal 
group but not as great as those of the 
exophoric subjects. 

If poor reading is associated with exophoric 
tendencies (Eames) and exophoric conditions 
are associated with the magnitude of diver- 
gence during fixation (Clark), it is reasonable 
to expect that a correlation would be found 
between reading ability and the magnitude of
the divergent movements. Stromberg, how- 
ever, did not find any relationship between 
phorias and the speed of reading (44), be- 
tween lateral muscle imbalance and the mag- 
nitude of divergence in fixations (45), or be- 
tween measures of divergence and reading 
speed (46). Unfortunately, no generally con- 
sistent trend appears in the findings of these 
investigators. 

Other information on convergent and 
divergent movements in reading points to
several possible relationships of interest. The 
magnitude of the divergence during fixation 
has been found to be related positively to the 
interfixation distance (12). McFarland, 
Knehr, and Berens (39) compared the eye 
movements of an experimental group of sub- 
jects with miscellaneous visual defects, with 
those of a control group having no significant 
visual defects. They found a tendency for the 
experimental group to be slower readers, to 
require more fixations and regressions, and to 
make larger divergent movements during fix- 
ation. In another paper (40), these authors 
report significant decreases in the magnitude 
of the divergent movements as one of the 
effects of oxygen deprivation. 

Extensive work is needed to determine the 
true significance of these most interesting and 
minute convergent and divergent eye move- 
ments. The function appears to be quite con- 
sistent. Measurements made by Knehr at the 
Harvard Psycho-Educational Clinic in 1935 
indicated that each fixation in reading is 
accompanied by some divergent adjustment. 
The largest divergence occurs in the first fix- 
ation of a line of reading matter and the tend- 
ency to make large or small movements on the 
first fixation is correlated with the tendency 
to make large or small movements in all fixa- 
tions. An analysis of Mr. Knehr’s data showed 
a correlation of .54 between the average diver- 
gence of the first fixations of each line and 
the average divergence of all other fixations. 
The reliability of these measurements remains 
to be determined, however; the problem is not 
simple, especially when an effort is made to 
determine the actual eye movement involved. 
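
As an illustration of the kind of computation that lies behind such a coefficient, the following minimal sketch computes a product-moment correlation between each subject's average divergence on first fixations and the average divergence on all other fixations. The function name pearson_r and the five pairs of values are hypothetical; they are not Mr. Knehr's data and serve only to show the arithmetic.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-subject averages in arbitrary film units (assumed values).
first_fixation_divergence = [4.0, 6.0, 5.0, 8.0, 7.0]
other_fixation_divergence = [2.0, 2.5, 3.5, 3.0, 4.0]

r = pearson_r(first_fixation_divergence, other_fixation_divergence)
print(f"r = {r:.2f}")
```

A coefficient obtained in this way describes only the consistency of the tendency across subjects; it says nothing about the reliability of the individual measurements on which it is based.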

The measurement of minute eye movements 
by the usual photographic means is not en- 
tirely satisfactory. Parallel rays of light re- 
flected from the surface of the cornea can be 
brought to focus on a moving photographic 
film. Any subsequent movement of the eyes 
will result in a change in the track made by
the light falling upon the film. Gross move- 
ments of the eyes are recorded adequately, but 
the photograph does not always provide a 
sufficiently good record to permit the accurate 
measurement of the small convergent and 
divergent movements of the eyes. The exact
points on the film that may be taken as the 
beginning and the end of a fixation are not 
easily located. Slight inaccuracies in the focus 
of the corneal reflections seriously affect the 
width and clarity of the tracings. The amount 
of light used and the speed of the moving 
film are important variables, also. Some im- 
provements in the eye movement camera 
would be required for the purpose of obtain- 
ing good records of convergent and divergent 
eye movements. 


In attempting to estimate the actual move- 
ment of the eye which is indicated by slight 
deviations on the film track, many difficulties 
are encountered. The physical characteristics 
of the camera can be determined accurately. 
From these, it is possible to interpret a given 
movement shown on the film in terms of the 
distance between the points of reflection in 
space, not on the cornea. This does not pro- 
vide a measure of the actual movement of the 
eye since the point of reflection on the cornea 
is variable. In order to transpose a given 
movement of the point of reflection into terms 
of the angular movement of the eye, it is 
necessary to take into account the curvature 
of the cornea, the source of the light, the angle 
of reflection, the center of rotation of the eye, 
the track of the successive points of reflection 
on the cornea, etc. The task is extremely for- 
midable. The practical approach may require 
the calibration of the eye-movement camera 
for each subject, and perhaps for each sample 
of data collected. 
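
One way in which such a per-subject calibration might be carried out is sketched below. It assumes, purely for illustration, that the subject fixates targets at known angular positions and that, over a small range, the film displacement is approximately linear in the angle of rotation; the function fit_line, the target angles, and the film readings are all hypothetical. The linear assumption is a simplification and does not dispose of the corneal geometry discussed above.

```python
def fit_line(film_mm, angle_deg):
    """Least-squares fit of angle = slope * film + intercept."""
    n = len(film_mm)
    if n != len(angle_deg) or n < 2:
        raise ValueError("need at least two paired calibration readings")
    mean_f = sum(film_mm) / n
    mean_a = sum(angle_deg) / n
    sxx = sum((f - mean_f) ** 2 for f in film_mm)
    if sxx == 0:
        raise ValueError("film readings must not all be identical")
    sxy = sum((f - mean_f) * (a - mean_a) for f, a in zip(film_mm, angle_deg))
    slope = sxy / sxx
    intercept = mean_a - slope * mean_f
    return slope, intercept


# Hypothetical calibration run for one subject (assumed values).
target_angle_deg = [-10.0, -5.0, 0.0, 5.0, 10.0]  # known target positions
film_position_mm = [-2.0, -1.0, 0.0, 1.0, 2.0]    # measured track positions

slope, intercept = fit_line(film_position_mm, target_angle_deg)


def film_shift_to_degrees(shift_mm):
    """Convert a small lateral shift on the film to degrees of rotation."""
    # Only the slope matters for a shift; the intercept cancels out.
    return slope * shift_mm


divergence_deg = film_shift_to_degrees(0.1)
print(f"{slope:.1f} deg per mm; a 0.1 mm shift is about {divergence_deg:.2f} deg")
```

Repeating the fit for each subject, and perhaps for each record, would absorb individual differences in corneal curvature into the fitted constants rather than require that they be measured directly.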


Many problems remain to be solved in an 
effort to obtain accurate measures of the 
divergent eye movements during fixation. The 
track on the film obtained in eye-movement 
photography is somewhat irregular in both 
width and clarity. This results in unsatisfac- 
tory measurements. In addition, the extent of 
the lateral shifting shown by the track on the 
film cannot be interpreted directly in terms 
of the actual lateral rotation of the eye. 
Although work is necessary in the develop- 
ment of accurate measures, it is well estab- 
lished that convergent and divergent eye 
movements do exist. Furthermore they can
be measured. 


Several matters regarding the convergent 
and divergent movements of the eyes during 
reading require investigation. First, the rela- 
tionship between these movements and mus- 
cular imbalances should be determined. It is 
possible that these movements provide a suit- 
able means for evaluating the disturbances of 
coordination that occur with obvious or latent 
muscular defects. If this were true, a practical 
technique would be available to evaluate the 
importance of muscle imbalance in reading 
and to check the effects of orthoptic training. 
Second, it should be determined whether or 
not convergent and divergent eye movements 
are related to reading ability as such, the 
growth of reading ability, or the maturity of 
the muscular coordination of the eyes. 


Aniseikonia.—Measurable differences in the 
size and shape of ocular images have been 
studied extensively by the Department of 
Research in Physiological Optics of the Dart- 
mouth Medical School. The first extensive 
descriptions of aniseikonia, its measurement, 
its significance, and its correction, appeared 
in 1932 (1, 2, 3, 4, 5, 41). Although the con- 
cept and mechanics of aniseikonia involve cer- 
tain peculiarities and may be complicated by 
other types of visual defect (36), aniseikonia 
is a measurable entity which undoubtedly 
deserves recognition among visual defects. 


As in the case of other eye defects, anisei- 
konia of sufficient magnitude may be detri- 
mental to reading. An early report by Dear- 
born and Comfort (18) suggested the possible 
relationship between the degree of aniseikonia 
and reading disability; the data involved were 
gathered from miscellaneous cases studied at 
the Harvard Psycho-Educational Clinic. 
Dearborn and Anderson (17) reported a sig- 
nificant difference between the amount of 
aniseikonia found in an experimental group 
of poor readers and that found in a control 
group. This difference was most marked when 
the size discrepancies were measured at read- 
ing distance. 

A relationship between aniseikonia and 
reading ability was not found by Imus, Roth- 
ney, and Bear (33). In fact, their data tended 
to reverse previous findings in that “Those 
diagnosed as having severe ocular defects or 
severe aniseikonia made higher scores in 
nearly every item.” It does not follow, however,
that severe aniseikonia is recommended
for good reading.

This difference in findings affords a good 
illustration of the necessity of keeping in mind 
the considerations discussed earlier in this 
paper. In the first place, the population sam- 
pling differed in the two studies: That of 
Imus, Rothney, and Bear dealt with college 
students, all of whom could read well enough 
to get into college, and many of whom may, 
during their years of schooling, have compen- 
sated for or surmounted any handicaps of a 
visual nature which might affect their reading. 
The investigation of Dearborn and Anderson 
dealt with elementary and high school stu- 
dents (with two or three exceptions) who 
ranged in age from nine to eighteen years and 
who were retarded on the average by approxi- 
mately three years in reading ability. Sec- 
ondly, the method and means of diagnosis of 
the visual condition were quite different. In 
the experiment of Dearborn and Anderson the 
diagnosis of aniseikonia was made on the basis 
of direct measurement with the apparatus de- 
vised for this purpose by Ames (3) and by 
the accepted or standard technique. In the 
case of the second experiment, the diagnosis 
was made on the basis “of the individual's 
performance on the tilting field” (a method 
which may demonstrate the presence of the 
condition, but the reliability and validity of 
which as a measure of the degree of the defect 
had not been determined at the time of the 
experimentation). 

Finally, the paradoxical finding as to the 
reading skills of aniseikonic students is based 
on 20 cases diagnosed (by the above noted 
tilting field) as instances of “severe” anisei- 
konia. An examination of the record shows, 
however, that while these individuals were in- 
deed somewhat better on the average in the 
skills of reading, they were far and away
superior as a group in scholastic aptitude or
general intelligence (as determined by a psy- 
chological test) than was the main body of 
students who had no ocular defects. The 
superior intelligence of these aniseikonic sub- 
jects may thus have offset whatever handicap 
to reading their visual conditions may have 
imposed. 

THE CLINICAL EVALUATION OF
VISUAL DEFECTS

Reading is a complex function which is 
affected by many things. A visual defect does 
not necessarily lead to deficient reading. Similarly,
good vision does not always result in
good reading ability. Vision is only one of the 
many factors which may contribute to diffi- 
culty in reading. The factors which influence 
the development of good reading are equally 
numerous. Consequently, when a case of read- 
ing disability is found to have a visual defect, 
many things must be considered. The visual 
defect may be (1) the primary cause, (2) a 
contributing element, or (3) a totally insig- 
nificant factor in the development of the dis- 
ability. The evaluation of the importance of 
the given visual deficiency requires careful 
analysis of the individual case. This can be 
done only by means of a thorough clinical 
study. 

The traditional concept of the scholar wear- 
ing thick glasses groping his way through the 
library stacks is not wholly fictitious. The 
accumulation of evidence which, in all prob- 
ability, led to the development of this concept 
is understandable. The “bookworm”, the student,
does not always have good vision. In
fact, a wide variety of types and degrees of 
visual deficiency can be found among good 
readers. For this reason, it cannot be assumed 
that the existence of an eye defect does great 
and permanent damage to the reading process 
as such. Certainly, experience with individual 
cases shows that people with visual defects 
can be good readers. 

On the other hand, the best possible visual 
function for reading purposes is obviously de- 
sirable. Whenever improvement in visual 
function can be obtained, the necessary meas- 
ures should be taken. The use of corrective 
glasses should be encouraged if vision can be 
improved without adverse effects. Studies 
have been made to determine the effect of 
eye corrections upon reading. Farris (26) 
found that lenses are aids to achievement. 
Children with corrected hyperopia and stra- 
bismus showed greater gains in reading than 
those with similar, uncorrected defects. This 
finding did not hold for myopic cases, how- 
ever. Eames (22) also noted some improve- 
ment in reading disability cases when visual 
defects were corrected. 

In some cases, the correction of eye defects 
may eliminate the cause of difficulty in read- 
ing. It should not be assumed, however, that 
the wearing of glasses results in unusually 
good or normal vision in every case. Corrections
are not perfect. Eames (25) has pointed
out that children wearing glasses do not pass
all tests of vision by virtue of the corrections
worn. Frequently, defects of some kind re- 
main. If visual defects are discovered, they 
should be corrected insofar as this is possible. 
Under no circumstances, however, can the 
analysis and treatment of reading disability 
be stopped at this point. 

The mere existence of a defect in vision 
does not indicate necessarily that the reading 
disability is caused by that defect. Other mat- 
ters must be considered. All of the problems 
discussed in this paper and many others con- 
cern the individual case. Although the correc- 
tion of visual deficiencies is very desirable, 
the potential effectiveness of the correction 
depends upon the determination of whether 
or not the eye condition caused the reading 
disability. The causal relationship, if any, be- 
tween the visual condition and the reading 
disability in any particular case is not easily 
determined. In order to be certain that a 
visual defect really accounts for a disability, 
it is necessary to observe that: (1) The exist- 
ing visual defect is sufficient to make the act 
of seeing and reading difficult at the level of 
ability and practice involved. (2) No other 
potential source of disturbance appears to a
sufficient degree to account for the difficulties 
in reading. The first item has been discussed. 
The second is beyond the scope of this paper. 

The evaluation of visual conditions as 
potential sources of difficulty in reading re- 
quires consideration of many problems. The 
evaluation of other factors which influence 
reading (16, 19, 29, 32, 54) may sometimes
involve similar complexities. Each case of 
reading disability presents a different combi- 
nation of characteristics with regard to the 
numerous factors which may contribute to 
difficulty in reading. For this reason, general 
rules cannot be applied to obtain the correct 
diagnosis in any particular case. Although sur- 
vey studies of large groups serve to point out 
the possible sources of difficulty and to direct 
the clinician’s attention to the matters to be 
considered most seriously, they cannot provide 
any fixed rules which will reveal the cause of 
reading disability in any single case. Each 
individual presents a special problem of diag- 
nosis. Only after careful study of each case 
can the true or the most probable source or 
sources of difficulty be established and the 
appropriate treatment instituted. Sometimes 
the source of difficulty is a visual defect. 
When this is true, the treatment should include
the correction of the defect, if possible,
and the appropriate remedial instruction. 

The place of the teacher in the treatment 
of reading disability is unique. Unfortunately, 
the teacher may sometimes be the cause of 
the difficulty. The teacher, however, is always 
involved in the successful treatment of read- 
ing disability. Regardless of the source of the 
disability, the individual involved must be 
taught to read and must learn to read. What- 
ever the clinical study may reveal as the cause 
of the disability and whatever the corrective 
measures may be, the teacher remains the 
ultimate factor in successful treatment. A good 
teacher and good teaching remain the real 
essentials after all defects are removed or corrected
insofar as it is possible to do so. In
fact, should the defect be such that correction 
is not possible, proper training remains as the 
only resource possible and, frequently, a great 
deal can be accomplished when non-correctable 
difficulties are taken into account in setting 
up a training program. 

The determination of the cause or causes 
of particular reading disabilities is not a 
simple matter. In any one case, there may be 
factors, major and contributing causes, involved.
Vision, as already noted, may be the
only cause, a contributing cause, or an insignificant
factor in any case. Careful study of
each individual is necessary before the true 
situation can be determined. The breadth of 
clinical analysis is most important to the 
understanding of the problems concerning 
visual defects and reading. 


SUMMARY 


1. The coordination of the information on 
visual defects and reading is complicated by 
great differences among studies with respect 
to (a) the basic populations and the samples 
of these populations studied, (b) the general 
methodology of the investigations, and (c) 
the functions measured, the measuring instru- 
ments used, and the criteria applied to dis- 
tinguish between good and poor vision, be- 
tween good and poor reading. 

2. The existence of a positive relationship 
between vision and reading is axiomatic. 

3. Material concerning visual defects and 
reading is grouped under three major ques- 
tions in an effort to clarify (a) the general 
importance of visual defects to reading, (b) 
the severity of an eye defect which may be regarded
as detrimental to reading, (c) the specific 
types of eye defect which affect reading. 


JOURNAL OF EXPERIMENTAL EDUCATION 


[Vol. 13, No. 3 


4. No data appear to provide a reasonable 
report on the general importance of visual 
defects as disturbing factors in reading. No 
agreement can be found in the literature on 
the subject. 

5. In order to determine the severity of an 
eye defect which is detrimental to reading, it
is necessary to consider (a) the capacity of 
the visual mechanism to compensate for some 
defects, (b) the extent to which the reading 
material taxes the defective eyes, (c) the level 
of reading efficiency and practice involved. 
Individual idiosyncrasies and tendencies toward
eyestrain and fatigue are also matters
for consideration. Individual differences indi- 
cate that the problem is one of clinical evalu- 
ation in each case. 

6. Under specific eye defects, literature on 
hyperopia, myopia, muscle imbalances, eye 
movements, aniseikonia and reading is re- 
viewed and an effort is made to discuss the 
significance of the information on these de- 
fects. In considering eye movements, it is 
suggested that (a) faulty eye movements be 
distinguished from eye movements in faulty 
reading, and (b) eye-movement photography 
be not regarded as a means for measuring 
reading as such. The true value of eye- 
movement photography lies in the detection 
of faulty eye movement as such; related prob- 
lems are discussed with special reference to 
convergent and divergent eye movements in 
reading. 

7. The clinical evaluation of visual defects 
in cases of reading disability is discussed 
briefly. 
CONTRIBUTIONS OF TEACHERS TO DENTAL HEALTH
KNOWLEDGE AND BEHAVIOR OF STUDENTS* 


Lean 


INTRODUCTION TO THE PROBLEM SITUATION 


Educators today generally accept a concept 
of education which calls for a curriculum 
based upon the needs of youth both individual 
and social. That health is a need of youth and 
that it is incumbent upon the schools to pro- 
vide for this need through health education 
need not be belabored. 

Need for dental health education — 
Whether dental health education, as one 
aspect of general health education, should be 
included in the health education curriculum 
on the junior-high and senior-high school 
levels, must again be determined on the basis 
of need, namely, the dental health needs of 
adolescents. 

Surveys of the prevalence of dental caries 
among school children of all ages in many 
parts of the country indicate that dental 
caries is an almost universal and a continuous 
disease. Universal in that it is widespread 
throughout this country¹ and continuous in
that it attacks each age group repeatedly. 

A survey conducted in twenty-six states in 
1933-1934 by the United States Public Health 
Service in conjunction with the American 
Dental Association disclosed that 90% of 
1,438,318 school children aged 6 to 14 years 
had carious teeth that had not been filled.²
Klein, Palmer and Knutson found that of 
1,236 school children aged 12 to 15 years in 
Hagerstown, Maryland, 77.4% had one or 
more carious unfilled permanent teeth.³ In
Georgia 72% of 161,000 white school children 
examined by dentists in 1938-1939, were in 


* Abstract of dissertation presented to the faculty of the 
Yale Graduate School in Candidacy for the Degree of Doctor 
of Philosophy. 

† The writer was awarded the Marion Talbot Fellowship
for 1943-1944 by the American Association of University
Women.

1 John D. Ratcliff, “The Town a Toothache.”
Collier’s (December 19, 1942) pp. 58-59.

One exception to the + of dental caries in the 
United States was reported by George W. Heard. This
report concerns Deaf Smith County, Texas which is unique
in that dental caries is seldom found to attack the residents 
of this town. 

2 C. T. Messner et al., Public Health Bulletin No. 226.
(Washington: Government Printing Office, 1933-1934). 

3 Henry Klein, Carroll Palmer and J. W. Knutson, “Studies
on Dental Caries.” Public Health Reports, Vol. 53, No. 19
(May 13, 1938) p. 7.
need of dental care.⁴ In the District of Columbia,
the 1942 annual report indicated that
72.9% of 18,869 junior-high school students
had one or more carious teeth needing fillings.⁵
Of the 7,144 colored children in this group
85.4% needed dental care⁶ and of the white
population in this junior-high group 65.3%
needed dental care.⁷
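
The three District of Columbia percentages can be checked against one another with a short computation. The sketch below assumes that the 18,869 students fall into only the two groups named, so that the white group is the remainder, and it treats “needing fillings” and “needing dental care” as the same count; both assumptions are made here for the purpose of the check and are not stated in the report.

```python
total_students = 18_869
colored_students = 7_144
# Assumption: the remainder of the group is the white population.
white_students = total_students - colored_students

# Group rates as quoted in the text (85.4% and 65.3%).
needing_care = colored_students * 0.854 + white_students * 0.653
overall_pct = 100 * needing_care / total_students

print(white_students, round(overall_pct, 1))
```

Under these assumptions the weighted figure agrees with the 72.9% reported for the group as a whole.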

In New Haven, dental examinations made 
in 1937 of 2,356 children of average and high 
economic areas disclosed that 77% needed 
fillings; and in 1938, examinations of 2,785 
children, representing an economic cross sec- 
tion of the school population, indicated that 
85% of the group had carious teeth that 
needed fillings.⁸ And more recently, in 1942-1943,
dental examinations of 727 eighth grade
students in New Haven schools, located in 
low economic areas, indicated that 87% of 
the group had one or more permanent teeth 
in need of dental care.⁹


Additional evidence that dental health is a 
common need of adolescents lies in the fact 
that children of each older age group exhibit 
increased dental disease in the form of unfilled 
caries and in the form of lost first permanent
molars.¹⁰, ¹¹, ¹²

The need for including dental health educa- 
tion in the general health education course is 
expressed by the joint committee of the NEA 
and AMA in the following terms: 


“To teach health in the ideal manner no
phase should be ignored. The dental aspect

4 The Teacher and the Dental Health Education Program.
State of Georgia, Department of Public Health (1940) p. 3.

5 Dr. A. Harry Ostrow, Annual Report of the Bureau of
Dental Services. District of Columbia (1942) Table 2A.

6 Ibid., Table 4A.

7 Ibid., Table 3A.

8 Leah Gold, “Study of the New Haven Dental Program
for the years 1937-1938,” Journal of American Dental Hygiene
Association, Vol. XIV, No. 3 (1940) p. 143.

9 Data collected for this investigation.

10 Henry Klein, Carroll Palmer and J. W. Knutson, op. cit.,
p.

11 C. E. Turner, Percy Howe and Marita Dick, “A Usable
Dental Health Index.” Journal of School Health, Vol. XII,
No. 2 (1942) p. 53.

12 Leah Gold, “Dental Health of School Children as Determined
by the Lost First Permanent Molar Index.” Journal
American Association of Public Health Dentists, Vol. II, No. 4
(October, 1942) p.
should always be considered, because there
is no ailment which is as universally present
among children as is dental disease.”¹³


The committee on the AASA allow for the 
inclusion of dental health education in the 
junior- and senior-high school health course
on the basis that “The selection of subject
matter for any particular group will depend
upon interests, needs and backgrounds.”¹⁴

Attempts to meet the need through the 
schools —In recognition of the need for dental 
health education, various programs attempt- 
ing to meet the dental needs of school children 
have been developed in many parts of the 
country. “A Survey of Mouth Hygiene Pro- 
grams for School Children” in cities of 150,000 
to 500,000 population disclosed, however, that 
a great deal of diversity exists between these 
dental programs, in organization, administration,
personnel employed and procedures followed.¹⁵

Out of this diversity four general patterns 
emerge which may be described briefly as
follows: 


1. The program is conducted by dental per- 
sonnel as a thing apart from the school cur- 
riculum. Dentists provide actual dental serv- 
ices, generally for children in the first three 
or four school grades and at times for dental 
indigents of higher school grades. Dental 
hygienists assist the dentists at the chair, keep 
records, clean teeth, examine teeth, follow up 
urgent cases and give individual dental health 
instruction. 

2. The second program, like the first, is 
conducted by dental personnel. Dentists give 
limited services to children in the first three 
or four school grades and some provision is 
made for older children of the dental indigent 
group through clinics or service clubs or both. 
This program differs from the first in that the 
dental hygienist devotes her time solely to 
educational activities, including examinations 
for dental defects, classroom and individual 
guidance in dental health, parent education 
and follow-up of children needing dental care. 
In addition, community dentists co-operate in 
this program by providing dental care for
indigents at reduced fees. This type of program
is also more or less distinct from the general
educational curriculum.


38 National Education Association, Health Education.
(Washington: N.E.A., 1941) pp.

34 American Association of School Administrators, Health in
Schools. 20th Year Book (Washington: AASA, 1942) p. 76.

* The Cleveland Child Health Association, A Survey of
Mouth Hygiene Programs for School Children.
Cleveland (1938). Section 2.


3. The third pattern employs, officially, no 
dental personnel. The program is solely a com- 
munity venture, carried on in the form of a 
campaign in which unofficial community 
dental groups take a leading part. During the 
campaign period the public schools are called 
upon to co-operate by emphasizing dental 
health through class instruction. 


4. The fourth type of program provides for 
organized dental health education by teacher 
participation in the program. Health instruc- 
tion is viewed as an integral aspect of the 
general education curriculum. The dental 
hygienist, in the capacity of specialist and
consultant, provides the teachers with source
material in the field of dental health and 
assists the teachers in handling special dental 
problem cases. The dental hygienist also 
makes dental examinations of specific age 
groups. The dental services for children in the 
first three school grades and for dental indi- 
gents of the upper grades are provided 
through dental clinics. Where clinic facilities 
are inadequate, private dentists are called 
upon to provide dental care to indigents at 
reduced rates. In many cases funds for these 
services are provided by service clubs or other 
unofficial organized groups. 

Criticism advanced against the dental pro- 
grams.—The criticisms advanced against pat- 
terns one and two are primarily concerned 
with the fact that these programs are not in 
keeping with present day educational theory." 
Dental health education in these programs is 
treated as a separate and distinct process, 
apart from the rest of the child’s development. 
Thus it ignores the now generally accepted 
concept of educating the “whole child.” An- 
other criticism of these programs pertains to 
the cost of employing large numbers of spe- 
cialists for conducting this one aspect of edu- 
cation and caring for this one aspect of 
health.” 18 

The third pattern, which features the dental
campaign by dental specialists, is viewed as an
undesirable educational practice. Campaigns
serve to stimulate only temporary interest and
so are inadequate for problems that call for
continuous interest.


* C. E. Turner, Can A Dental Service Program Be An
Educational Venture? Contribution No. 173. Department of
Biology and Public Health, Cambridge: Massachusetts Institute
of Technology (1940). Reprint.

* H. D. Chope, “An Administrator Cogitates His Dental
Program,” The Journal of School Health, Vol. X, No. 2
(February, 1940) p.

* Kenneth ..., “A Practical Pro... Most Officers Can Have,”
Journal of School Health, Vol. ..., No. 4 (April, 1942) p. 124.

The fourth pattern, by incorporating dental 
health education into the school curriculum, 
takes cognizance of the fact that dental health
is an aspect of a child’s development and so 
should be the concern of the regular grade 
teacher. Furthermore, since dental health is a 
continuous problem it can receive continuous 
attention when incorporated into the school 
teaching program. Finally the fourth pattern 
is supported by educators because it reduces 
the need for a large staff of specialists in 
dental health and so reduces the cost of the 
dental program.?”: 2; 2%, 28 

The problem.—Although school admin- 
istrators and health education authorities 
generally view with favor the dental health 
program in which the regular grade teachers 
participate in the educational aspects of the 
dental program, there is nevertheless little 
agreement concerning the degree to which 
teachers should participate in order to con- 
tribute most effectively to the dental health 
of students. 

If the aim of health education is to encour- 
age “the development of right attitudes and 
habits as well as sound knowledge in the field 
of health”** or to put it in other terms, to 
develop “desirable practices, attitudes and 
understandings,” it follows that dental 
health education in the hands of the teacher 
should not only increase student knowledge 
of dental health factors, but should also in- 
fluence students to practice personal dental 
hygiene, obtain necessary dental care, select 
approved tooth brushes and dentifrices and
eat foods that contribute to dental health. 


INVESTIGATION INTO THE PROBLEM


The investigation, conducted in New Haven
from September 1942 through May 1943, was
undertaken in an effort to evaluate the
contributions made by regular grade teachers
to the dental health knowledge, judgment and
behavior of adolescents, through different
degrees of teacher participation in class dental
health education and in individual dental
health follow-up. The dental programs were
directed by a dental health educator.**


* C. E. Turner, op. cit.
* Dr. H. D. Chope, op. cit., p. 36.
* C. E. Turner, Principles of Health Education.
(D. C. Heath, 1932) p. 12.
* AASA, op. cit., p. 50.

The sample examined.—Seven hundred and 
twenty-seven eighth grade students aged 12 
to 15 years (as of last birthday) selected from 
seven schools containing similar nativity stock 
and situated in low economic areas (i.e. mean
income below $1,500)* were tested for
dental health knowledge and judgment and 
examined for dental needs. 

From this total group, two major uncor- 
related groups were formed, experimental 
Group A and control Group B, matched on
sex ratio, mean knowledge scores and sigmas.
These major groups were later subdivided 
into four minor correlated groups of 65 stu- 
dents each for the purpose of making more 
precise evaluations. Experimental Group A 
was broken down into Groups E and F and 
control Group B provided Groups G and H. 
In addition, two ninth grade groups, experi- 
mental Group C and control Group D, each 
consisting of 74 students aged 12 to 15 years,
matched on mean knowledge scores, sigmas 
and sex ratio, were selected for the purpose 
of determining the relative carry-over value of 
different degrees of teacher participation in 
dental programs. 

Experimental and control groups had been 
exposed for four months to a dental health 
education program directed by a dental health 
educator. Each dental program provided “in- 
cidental” instruction in dental health through 
various related courses. An average of thirteen 
annual hours was devoted to this “incidental” 
dental health education. Three fourths of this 
time was given to the study of nutritional 
aspects. 

In the experimental programs, the teachers
provided additional dental health education
through an organized health course, and
participated in the individual follow-up of
student dental health needs. These programs
were therefore referred to, in this study, as
Full Teacher Participation programs. Analyses
of the planned dental health education
class periods in the Full Teacher Participation
programs indicated that major emphasis was
given to the subject matter areas of dental
care and hygiene, diet, consumer knowledge
and dental disease, while the areas of dental
structure, function and development received
minor attention. Further analyses of these
class periods indicated that the four teachers
observed made use of the lecture method,
discussion method, teacher-directed questions,
visual aids and student-directed questions.


* Dental health educator: an individual who is both an
educator and a dental hygienist.

* Maurice R. Davie, “Patterns of Urban Growth,” Studies
in the Science of Society. George P. Murdock, editor. (New
Haven: Yale University Press, 1937) p. 158.

The control programs provided little or no 
teacher participation in class dental health 
education beyond the “incidental”, and no 
participation at all in the follow-up aspects. 
These programs were therefore referred to, in 
this study, as Slight and No Teacher Partici- 
pation programs. 

The correlated experimental Groups E and 
F were both exposed to Full Teacher Partici- 
pation programs, which differed, however, in 
minor respects. The Group E program included
twelve class periods or nine annual hours of
dental health education, while the Group F
program had only six class periods or four and
one-half annual hours of similar instruction.
Furthermore, the follow-up program of Group
E provided two-minute individual conferences
conducted within the hearing of the entire 
class, and made use of marks as incentives 
for obtaining dental corrections, while the 
follow-up program of Group F provided an 
average of eight minutes for individual con- 
ferences conducted privately and marks were 
not offered as rewards or punishments for 
dental corrections obtained. 

The correlated control Groups G and H 
were alike in all respects but one, namely, the
Group G program provided, in addition to the
“incidental” dental health education, three
annual hours of instruction in this area,
administered sporadically in brief periods, while
the Group H program provided for no
instruction beyond the “incidental.”

Methods of gathering data.—The knowledge
and judgment data were obtained by
means of a written Test in Dental Health
Knowledge and Judgment, constructed and
validated for purposes of this investigation.
This test was administered to the study groups
both at the beginning and at the end of the
four-month experimental period. The diet data
were obtained by means of a questionnaire
provided on this same test, the first page of
which called for recordings of “Foods Eaten
Yesterday.”

The data on dental health behavior, in
terms of dental care obtained, mouth hygiene
practiced and tooth brush and dentifrice
selected, were obtained by examination of
students’ teeth, tooth brushes and dentifrices,
both at the beginning and at the end of the
four-month experimental period.


[Vol. 13, No. 3

Data describing the departmental contribu- 
tions to dental health education were obtained 
from reports provided by principals, guidance 
directors and teachers of each study school. 
Finally, evidence of teacher participation in 
organized dental health education courses and 
in individual follow-up was obtained from 
observational recordings made by the investi- 
gator and approved by the teachers observed. 

Validation of evaluative criteria.—The
reliability and validity of the written test were
established on the basis of authoritative judg- 
ments and by the results of a control study. 
Statistical evidence of the reliability of this 
instrument was obtained by reliability coeffi- 
cients of +.74 and +.73 obtained between 
test form scores of tested groups of 727 and 
143 eighth graders, respectively. 

The objectivity of the dental health
behavior criteria, as defined, and the validity of
dental examinations and recordings made by
a dental health educator were established by
the results of a control study conducted by a
dentist and a dental health educator.
Statistical analysis of the recordings made by the
two examiners provided correlation
coefficients ranging from +.88 for tooth brush
evaluations to +1.00 for recordings of teeth 
lost, teeth to be extracted and dentifrice eval- 
uations. The magnitude of these correlation 
coefficients indicated that the criteria of 
dental health behavior, as defined, were highly 
objective and that dental examinations made 
by a dental health educator, using mirror and 
explorer, were highly valid. 
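
The agreement check described above can be illustrated with a short sketch. The article does not say which correlation formula was used, so the Pearson product-moment coefficient is an assumption here, and the function name and the sample counts are hypothetical illustrations rather than data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length (at least 2 items)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical counts of teeth lost recorded for six students (not study data).
dentist = [0, 1, 0, 2, 1, 3]
educator_same = [0, 1, 0, 2, 1, 3]   # identical recordings: perfect agreement
educator_off = [0, 1, 1, 2, 1, 3]    # one disagreement: agreement below 1

r_same = pearson_r(dentist, educator_same)
r_off = pearson_r(dentist, educator_off)
```

Identical recordings by the two examiners give a coefficient of +1.00, which is the value reported for teeth lost, teeth to be extracted and dentifrice evaluations; any disagreement pulls the coefficient below 1.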


RESULTS OF INVESTIGATION 


Major findings.—It is noteworthy that the 
original group of 727 students selected for 
study represented an average level of school
achievement when measured against the score
range, and median and quartile eighth grade
norms established for this test. In view of this
finding, it is necessary to consider the results
obtained from this investigation as indicative
of the behavior of adolescents of average
school achievement. 

Knowledge and judgment.—In applying the
criterion of mean knowledge and judgment
increases to the thesis of this study it was
found that experimental groups, exposed to
the influence of Full Teacher Participation in
dental programs, significantly surpassed
control groups exposed to Slight and No Teacher
Participation in such programs.

This was evidenced by the results presented
in Table I which indicate that experimental
Group A exceeded control Group B in mean
knowledge increase by a critical ratio of 5.0.
Likewise, experimental Group EF exceeded
control Group GH in mean knowledge
increase by a critical ratio of 2.7.


TABLE I

EXPERIMENTAL GROUP A COMPARED TO CONTROL GROUP B IN TERMS OF SCORE CHANGES

                               Initial Scores             Increases
Groups                         Means        Sigmas        Means         Sigmas

                               112.8        12.5          11.5          10.0
                               ±.8          ±.6
Significance of Differences    D = .7       D = .5        D = 4.2       D = 0.0
                               SEp = 1.1    SEp = .6      SEp = .85
                               D/SEp = .6   D/SEp = .8    D/SEp = 5.0


Further application of the criterion of mean
knowledge increases indicated that a Full
Teacher Participation program, with nine
annual hours of organized dental health
education, supplementing “incidental” instruction,
was not significantly superior to a Full
Teacher Participation program with four and
one-half annual hours of similar education,
and this program in turn was not significantly
superior to a Slight Teacher Participation
program with three annual hours of dental health
education administered sporadically, nor was
this program significantly superior to a No
Teacher Participation program providing only
“incidental” departmental dental health
education. These findings were evidenced by the
fact that experimental Group E surpassed
experimental Group F by a critical ratio of
1.5 in mean knowledge increase; experimental
Group F surpassed control Group G by a
critical ratio of 0.6 and this group in turn
surpassed control Group H by a ratio of 0.4.
However, it is noteworthy that the knowledge
gains exhibited by the groups consistently
increased as the time devoted to class
instruction increased.

Dental health behavior.—In applying the
criteria of dental health behavior to the thesis
of this study it was found that the group
exposed to Full Teacher Participation programs
significantly and consistently surpassed the
groups exposed to Slight and No Teacher
Participation programs during the four-month
experimental period.

This was evidenced by the results presented
in Tables II, III and IV which indicate that
experimental Group A exceeded control
Group B by a critical ratio of 10.8 in percent
students obtaining fillings, by a ratio of 8.0
in mean number of teeth filled, and by a
critical ratio of 2.9 in the degree to which
students obtained fillings in relation to their
actual needs. Furthermore, experimental
Group A surpassed control Group B by a
critical ratio of 12.0 in percent students who
exhibited improved mouth hygiene, by a critical
ratio of 2.6 in percent students selecting
approved tooth brushes and by a critical ratio
of 5.6 in percent students selecting approved
dentifrices. Similarly, experimental Group EF
surpassed control Group GH on each criterion
of dental health behavior tested.


TABLE II

EXPERIMENTAL GROUP A AND CONTROL GROUP B COMPARED FOR TEETH FILLED DURING FOUR
MONTH EXPERIMENTAL PERIOD

                               % Students who     Mean number     Correlation of
Groups                         had one or more    of teeth        teeth caries and
                               teeth filled       filled          teeth filled

Significance of Differences    D = 43.2           D = 3.2         D = .32
                               SEp = 4.0          SEp = .4        SEp = .11
                               D/SEp = 10.8       D/SEp = 8.0     D/SEp = 2.9


TABLE III

EXPERIMENTAL GROUP A AND CONTROL GROUP B COMPARED FOR CHANGES IN MOUTH HYGIENE
PRACTICE AS EVIDENCED BY NEED FOR DENTAL PROPHYLACTIC TREATMENT

                               Percent students needing dental prophylaxis
Groups                         At Initial Exam    At Final Exam    Decrease

A
B                              72.1               79.2             -7.1
Significance of Differences    D = 3.1            D = 48.5
                               SEp = 3.9          SEp = 3.9        SEp = .35
                               D/SEp = .8         D/SEp = 12.3     D/SEp = 14.6


TABLE IV

EXPERIMENTAL GROUP A AND CONTROL GROUP B COMPARED FOR SELECTION OF APPROVED TOOTH
BRUSHES AND DENTIFRICES

            Percent students selecting approved tooth brushes and dentifrices
            At Initial Exam       At Final Exam         Increases
Groups      T. Br.     Dent.      T. Br.     Dent.      T. Br.     Dent.

A           44.0       12.9       78.7       45.0       34.7       32.1
B           39.5       11.0       69.0       22.0       29.5       11.0

Significance of Differences
D           4.5        1.9        9.7        23.0       6.2        21.1
SEp         4.1        3.0        3.8        4.1        4.1        3.5
D/SEp       1.0        .6         2.6        5.6        1.5        6.0


Although the criteria of dental health
behavior served to differentiate significantly the
experimental Full Teacher Participation
programs from the control Slight or No Teacher
Participation programs, it is noteworthy that
application of these selected criteria did not
serve to distinguish significantly between the
two experimental Full Teacher Participation
programs, nor did it differentiate the control
Slight Teacher Participation program from
the control No Teacher Participation program.
Thus it was found that the correlated
experimental Groups E and F did not differ
significantly in percent students who obtained
fillings, in the mean number of teeth filled, in
the degree to which fillings obtained were
correlated to fillings needed, nor did they
differ significantly at the end of the
experimental period in percent students exhibiting
improved mouth hygiene and in percent
students selecting approved tooth brushes and
dentifrices. These findings indicate that the
two Full Teacher Participation programs,
although differing in amount of time devoted
to class dental health education and exhibiting
different patterns of follow-up, were not
demonstrably different in their effects upon
student dental health behavior, as defined.
Likewise, when the two correlated control groups
G and H, representing the Slight and No
Teacher Participation programs, respectively,
were compared for changes in dental health
behavior, it was found that they did not differ
significantly from each other on any of the
dental health behavior criteria tested.

Since these control programs differed in
amount of time devoted to class dental health
education but were alike in the respect that
neither one provided for teacher participation
in individual follow-up, it may be inferred
that dental health behavior is more directly
influenced by individual follow-up than by
class education. Furthermore, the evidence
presented, indicating that the two Full
Teacher Participation programs had
approximately equivalent effects upon dental health
behavior of correlated groups (E and F), and
that the Slight and No Teacher Participation
programs were approximately equivalent in
their effects, during the same period, upon
dental health behavior of correlated groups
(G and H), and still further, the fact that
the combined Full Teacher Participation
Group EF exhibited a significantly greater
improvement in dental health behavior than
did the combined Slight and No Teacher
Participation Group GH, serve to strengthen the
inference drawn that dental health behavior is
more readily influenced by teacher
participation in individual follow-up than by class
instruction.
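
The critical ratios quoted in this section appear in the tables as D/SEp, that is, the difference between the two groups divided by the standard error of that difference. The sketch below recomputes the three ratios printed in Table II from the printed D and SEp values. The function and variable names are hypothetical, and the article does not show how the standard errors themselves were derived, so they are taken as given.

```python
def critical_ratio(difference, standard_error):
    """Return D / SE, the critical ratio as tabulated in the article."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return difference / standard_error


# (D, SEp) pairs as printed in Table II.
table_ii = {
    "percent students with one or more teeth filled": (43.2, 4.0),
    "mean number of teeth filled": (3.2, 0.4),
    "correlation of caries and teeth filled": (0.32, 0.11),
}

# Round to one decimal place, matching the precision used in the tables.
ratios = {
    label: round(critical_ratio(d, se), 1)
    for label, (d, se) in table_ii.items()
}

for label, ratio in ratios.items():
    print(f"{label}: D/SEp = {ratio}")
```

The recomputed values agree with the ratios of 10.8, 8.0 and 2.9 cited in the text for fillings obtained.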

Carry-over values.—When the carry-over
values of the different degrees of teacher par- 
ticipation were tested by the criteria of dental 
health behavior, it was found that experimen- 
tal Group C (which had been exposed to a 
Full Teacher Participation program) consist- 
ently surpassed control Group D (which had 
been exposed to a Slight Teacher Participation 
program) with respect to percent students 
who obtained fillings, percent students who 
practiced mouth hygiene and percent students 
who selected approved tooth brushes and 
dentifrices. These findings indicate that Full 
Teacher Participation in a dental program has 
a significantly greater carry-over effect upon 
student dental health behavior than has Slight 
Teacher Participation. However, it should be 
noted that Slight Teacher Participation may 
also have had some carry-over effect as evi- 
denced by the fact that 82.5% of Group D 
needed fillings as compared with the expected 
norm of 87% for the same age groups of low 
economic level in New Haven. 

Diet.—In applying dietary criteria to the
thesis of this study, it was found that these 
criteria were unrelated to the various degrees 
of teacher participation in the dental health 
education programs. In fact, the results ob- 
tained indicated that neither the experimental 
group nor the control group was significantly 
influenced in dietary patterns over the
four-month period. These findings are not
surprising when the complex nature of the factors
that influence dietary patterns is taken into
consideration.**

The dietary data obtained in this study
were useful in disclosing the strengths and
weaknesses of the diets of the students.
Analysis of these data indicated that the students’
diets were deficient in milk, butter, citrus
fruits and green and yellow vegetables. Similar
findings have been reported by Wilson* and
Morgan.*


* G. Nizzardini and N. F. Joffe, Italian Food Patterns and
Their Relationship to Wartime Problems of Food and
Nutrition (Washington, D. C.: National Research Council, 1943).

Subsidiary findings—In addition to the 
major findings reported, the study yielded 
some subsidiary findings. 

Dental examinations.—Analysis of time 
schedules of the dental health education pro- 
grams, conducted by the dental health edu- 
cator in the study schools, indicated that the 
dental health educator could observe and 
record the dental needs of 43 adolescents in a
five-hour day. On the other hand, when
making examinations for corrections obtained, the
dental health educator was able to handle 60
adolescents in a five-hour day. In view of the
speed and validity with which a dental health 
educator can conduct these examinations, it 
seems educationally justifiable to include 
these procedures in a dental health education 
program. If time is not available for both 
examinations, however, it might be advisable 
to eliminate examinations for need and retain 
examinations for corrections.**. ** 
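
The examination rates above translate into minutes per student, which may help when fitting examinations into a school timetable. The sketch below is plain arithmetic on the figures given in the text; the function name is hypothetical.

```python
def minutes_per_student(students_per_day, hours_per_day=5):
    """Average examination time per student, in minutes."""
    if students_per_day <= 0:
        raise ValueError("students_per_day must be positive")
    return hours_per_day * 60 / students_per_day


# 43 examinations for dental need in a five-hour day.
need_exam_minutes = round(minutes_per_student(43), 1)

# 60 examinations for corrections obtained in a five-hour day.
correction_exam_minutes = round(minutes_per_student(60), 1)
```

On these figures an examination for need takes about seven minutes per student and an examination for corrections about five.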

Conclusions.—In the light of the findings 
reported, it may be concluded that: 

Full Teacher Participation in dental health 
education programs, directed by a dental 
health educator, is more effective than Slight 
or No Teacher Participation in similar pro- 
grams, both with respect to increasing student 
knowledge and judgment and improving
student dental health behavior. Furthermore, Full
Teacher Participation in a dental health
education program has a significantly greater
carry-over effect upon student dental health
behavior than has Slight Teacher
Participation. In fact, a Slight Teacher Participation
program, which provides brief sporadic
periods of instruction and no follow-up of student
dental health, is little more effective in
influencing student dental health knowledge,
judgment and behavior than is a No Teacher
Participation program. 

Interpretations.—The conclusions presented
leave small question concerning the
desirability of providing for teacher participation in
class dental health education and in individual
follow-up of student dental health. However,
careful consideration should be given to the
questions of time to be devoted to class
instruction, organization for the class
instruction and procedures to be used in follow-up.


2 Charles C. Wilson, The Diets of Hartford School Children.
(Hartford: Board of Education, July 1941) p. 10.

* Lucy Morgan, An Evaluation of School Health Education
in Secondary Schools in Tennessee in Terms of a Study of ...
(Unpublished Dissertation, Yale University,
New Haven, Conn., 1938) pp. 70-80.

* Dental needs are more or less universal.

* Ruth A. Frankel, An Evaluation of a Dental Health
Educational Program by Corrections Achieved. (Unpublished
Doctorate Dissertation, New York University, N. Y.: 1940) p. 140.
Mrs. Frankel found that signed correction slips by community
dentists were not reliable indices of corrections achieved.

The data obtained in this investigation sug- 
gest that slight gains in group mean knowl- 
edge consistently accompany increased time 
devoted to class dental health education. 
Thus, nine annual hours of regular class peri- 
ods (supplementing departmental contribu- 
tions) produced somewhat greater gains than 
did four and one-half hours of similar
education and this plan was slightly more effective
on knowledge increase than three annual 
hours of sporadic instruction, which in turn 
was slightly more effective than no class in- 
struction beyond “incidental.” 

At first glance, these findings suggest that, 
of the four degrees of time allotment studied, 
the optimum time allotment for class dental 
health education at the eighth grade level is 
nine annual hours. It must be borne in mind, 
however, that the nine hour program and the 
four and one-half hour program were accom- 
panied by individual follow-up of student 
dental needs. The degree to which the follow- 
up procedures influenced knowledge increases 
was not determined in this investigation.
Consequently, on the basis of the findings
presented, no clear-cut recommendations can be
made concerning the amount of time to be
devoted to class dental health education. How- 
ever, until information is available indicating 
the relative merits of class instruction and 
individual follow-up in affecting knowledge 
increases, it seems reasonable to determine the 
amount of time for class instruction in the 
light of student dental health knowledge as 
measured against the dental health knowledge 
norms established for the particular school. 
Thus, the time devoted to class dental health 
education will vary as each group varies in its 
ability to progress from the established test 
norms of one grade to the established test 
norms of the next grade. 

Closely related to the question of time allot- 
ment for class instruction is the problem of 
curriculum organization for this instruction. 
In other words, should regular class periods 
be devoted to this area of study or should this 
area be studied sporadically, in brief periods? 
To investigate this problem experimentally, 
it would be necessary to examine equated 
groups, one exposed to a specific number of 
hours of sporadic instruction and the other
exposed to the same number of hours of
regular class periods of instruction. Under
conditions of the present investigation, this
problem could not be studied, because none of the
study school programs provided regular class
periods of dental health education, per se, i.e.
without individual follow-up programs. In
view of this limitation in the present study a
clear-cut recommendation cannot be made on
this aspect of the dental program. However,
considering the problem in the light of learn- 
ing theory, it is quite likely that average 
length class periods are better adapted to the 
interest span of adolescents than are brief 
periods, and that regular instruction provides 
better opportunity for continuity and integra- 
tion of learning than does sporadic instruction. 


Regarding the question of follow-up
procedures, it was found that programs which
provide for frequent two-minute individual
conferences, conducted within the hearing of the
class and in which marks are used as incen- 
tives for obtaining dental care, are just as 
effective in influencing dental health behavior, 
as are follow-up programs that provide for 
lengthier individual conferences, conducted 
privately and in which marks are not used as 
incentives for obtaining dental corrections. 
However, in view of the possibility that per- 
sonal interviews conducted publicly might 
have subtle undesirable effects upon other 
aspects of student behavior, it may be more 
justifiable educationally, to provide for pri- 
vately conducted personal interviews in a 
dental follow-up program. Furthermore, since 
student needs vary, it seems reasonable to 
expect that individual conferences adapted in 
time to the varying needs of individual stu- 
dents would be more desirable than brief con- 
ferences of constant length. Finally the prac- 
tice of employing marks as incentives for 
obtaining dental corrections merits careful 
consideration. If equal financial and temporal 
opportunity for obtaining dental care is pro- 
vided to all students, then the use of marks 
as incentives may not be harmful. However, 
the dental care situation calls not only for 
such equal opportunity but also calls for 
parent approval and co-operation. It is quite 
likely that some students will fail to obtain 
dental care because of lack of parent interest. 
In such cases, the mark has small value as an
incentive; in fact, it becomes harmful, for it
serves to punish a student for behavior other
than his own.

In the light of the interpretations presented 
above, it appears that dental health education
programs at the junior-high level, to be
effective in improving student knowledge,
judgment and behavior, should include teacher
participation in regular class periods of in- 
struction, possibly ranging from four and one- 
half to nine annual hours, according as stu- 
dent needs indicate. Furthermore, the pro- 
grams should provide for teacher participation 
in privately conducted individual follow-up 
conferences which vary in length according to 
student needs. These conferences should moti- 
vate students to obtain corrections through an 
understanding of personal need rather than 
for the reward of a good mark. 

Problems for further study.—Several
problems touched upon in this investigation might
well be considered for further study.

First, to check the results of our limited
study of carry-over values of different degrees 
of teacher participation, the present major in- 
vestigation could be extended for a two or 
three year period. This would permit a follow- 
up of the dental health knowledge and be- 
havior changes of the study students in their 
progress through the ninth, tenth and even 
eleventh grades of school. 

Second, to determine the relative merits of 
regular class periods and brief sporadic periods 
of dental health education, equated groups 
might be studied for knowledge and behavior 
changes over specified periods. 

Third, equated groups, one exposed to class 
dental health education and the other exposed 
to follow-up for a specific period, might be 
examined for knowledge and behavior changes 
in an effort to determine whether class instruc- 
tion or individual follow-up is more directly 
related to knowledge and behavior changes. 
The determination of the degree of correlation 
between the knowledge increases and dental 
health behavior changes over a specified 
period might throw additional light on the 
values of class instruction and individual 
follow-up. 


APPENDIX 


Teacher versus nurse participation in dental 
follow-up.—Analysis of dental follow-up re- 
ports obtained from two teachers of the Full 
Teacher Participation programs and from a 
school nurse of one of the Slight Teacher
Participation programs disclosed that 93% of the
students exposed to teacher participation in
follow-up obtained dental care during a six
month period as compared to 40% of the 
students who were exposed to follow-up by 
the school nurse. These results suggest the 
possibility that a dental follow-up program in 
which teachers participate may be about two 
and one third times more effective in influ- 
encing student dental health behavior than a 
follow-up program carried on by a nurse with 
no participation by teachers.³⁸


DEFINITIONS OF RESEARCH TERMINOLOGY 


Constant (control factor) is a factor which 
presumably or demonstrably is capable of 
modifying results in a given investigation. 
Such factors must be controlled for experi- 
mental purposes. 

Control Group is one which in an experimental 
study is exposed to the constant treatment 
or conditions to which all groups involved 
are exposed, but is not exposed to the vari- 
able factor, i.e. the particular treatment or 
condition being investigated. The results 
obtained from control groups serve as a 
base against which to compare the results 
obtained from the experimental groups and 
thus supposedly indicate how much of the
gain or change produced in the experimental 
group may have resulted from the experi- 
mental method or procedure or condition. 


Coefficient of Correlation (r) is a
numerical measure indicating the degree 
(amount and direction) of relationship be- 
tween two or more variable factors. The 
relationship may be positive or negative and 
may range from +1.0 to —1.0. Positive 
correlation suggests that under given con- 
ditions the related variables may be ex- 
pected to exhibit relatively similar changes 
in the same direction; negative correlation 
suggests that under given conditions the 
related variables may be expected to exhibit 
relatively similar changes in opposite
directions. A coefficient of correlation equal
to +1.00 indicates perfect positive
relationship. A coefficient of 0.00 indicates no
relationship; a coefficient of −1.00 indicates
perfect negative relationship.


It is important to note that the relationship
between variables is not necessarily causal,
i.e., a correlation does not show that the
factors cause each other to change; rather,
the variables may change together as the given
conditions are altered.

³⁸ Corrections obtained during the first four months of the
six month period were determined by examinations made by
the dental health educator, while corrections obtained during
the last two months of the period were determined by means
of the number of signed correction slips from community
dentists.


Correlated Groups are groups that are pre- 
cisely equated in such a manner that each 
student of one group is matched to a stu- 
dent of the other groups on controlled 
factors or traits. 


Criterion is a standard or measure against 
which other measures are validated (i.e. 
demonstrated as true measures). A criterion 
must be valid (true); consequently, it is
often necessary to validate (demonstrate 
the truthfulness of) the criterion before 
using it as a standard. 


Critical Ratio is the quotient obtained by 
dividing the difference between two meas- 
ures by the standard error of this difference. 
A ratio of 1.0 indicates that the chances are 
about 68 in 100 that the difference is too
great to be the result of sampling
fluctuations; a ratio of 2.0 indicates 95 chances in
100 that the difference is too great to be the 
result of sampling fluctuations and a ratio 
of 3.0 or more indicates practical certainty 
that the difference is too great to be the 
result of sampling fluctuations. 
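
The chances quoted for each critical ratio follow from the normal curve, and a short computation makes the arithmetic explicit. The sketch below is illustrative only: it assumes the difference is normally distributed, and the names `critical_ratio` and `chances_in_100` are hypothetical, introduced here for the example.

```python
import math


def critical_ratio(measure_a, measure_b, se_difference):
    """Difference between two measures divided by the standard error of that difference."""
    return (measure_a - measure_b) / se_difference


def chances_in_100(ratio):
    """Chances in 100 that a normal deviate falls within +/- `ratio` standard errors."""
    return 100.0 * math.erf(abs(ratio) / math.sqrt(2.0))


# Tabulate the three ratios discussed in the definition.
for ratio in (1.0, 2.0, 3.0):
    print(f"critical ratio {ratio:.1f}: about {chances_in_100(ratio):.1f} chances in 100")
```

The same three figures describe the share of cases lying within one, two and three sigmas of the mean of a normal distribution.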

Experimental Group is one which in an experi- 
mental study is exposed not only to the con- 
stant or controlled factors or conditions of 
the experiment, but also to the variable 
factor or condition being investigated. 


Mean is a measure of central tendency 
obtained by summing the separate scores 
and dividing the sum by the number of 
scores involved. 

Median is the midscore or midpoint in a 
series. 

Objective describes a criterion, index, instru- 
ment or any condition that permits precise 
quantitative analysis, i.e. measurement free 
from personal feelings or prejudice. 

Reliability is the degree to which an instru- 
ment consistently measures whatever it 
measures. 

Sigma or Standard Deviation (σ) is the most
reliable measure of variability. In a normal 
curve of distribution one sigma from the 
mean, taken in both directions includes 
about 68% or the middle two thirds of the 
cases; two sigmas on each side of the mean
will include 95% of the middle cases and
three sigma lengths will account for prac- 
tically all the cases in the distribution. A 
large sigma indicates heterogeneity, a small 
sigma indicates homogeneity. 

Significant (statistically) indicates practical 
certainty that the difference between two 
measures is too great to be attributed to 
the fluctuations of sampling. In educational 
research a critical ratio of 3.0 is usually 
defined as indicating statistical significance. 
However, there is no reason why signifi- 
cance should be thought to appear with 3.0 
when none existed at 2.9. As a matter of 
fact some investigators think of results as 
carrying considerable weight when the ratio 
is 1.9 and quite convincing when the ratio 
is 2.6. 


Validity is the degree of accuracy or truthful- 
ness with which an instrument or criterion 
measures that which it is designed to
measure. Consequently it is necessary to 
know what specific factors are being 
measured before the validity of the instru- 
ment or criterion can be determined. Valid- 
ity is not a general concept, it is a specific 
concept. Therefore an instrument cannot be 
described simply as valid, but must be 
described as valid for a specific purpose. 

Variable is a trait or quality existing in
different amounts in different individuals.

By the application of the “principle of
likelihood,” Neyman and Pearson (References
[4] to [8]) have obtained criteria suitable for 
testing statistical hypotheses of a broad vari- 
ety. This principle has been found to link 
together a large number of tests previously in 
use—such as, that used by Fisher for the com- 
parison of two estimates of variance [1] and 
several tests for the significance of the differ- 
ence between means [5] of samples from pop- 
ulations in which the members are distributed 
in accordance with the Gaussian law. The pro- 
totype of the principle of likelihood is one’s 
conviction that the degree of confidence placed 
in a hypothesis depends upon the relative 
probability of alternative hypotheses.¹

Due consideration of this point of view 
leads one to realize that a criterion used for 
the purpose of quantifying this confidence 
should decrease as the probability of alterna- 
tive hypotheses becomes relatively greater. 
Because of certain a priori limitations on some
particular alternatives which cannot be
expressed in exact terms, it is impossible, in
practice, to scale the confidence with which one
forms a judgment by the use of any single 
numerical criterion. Although it is impossible 
here, as in all applied mathematics, to bring 
the real situation in agreement with the ideal, 
some form of numerical measure is useful as a 
guide and a control. Other forms of criteria 
are certainly possible but the one proposed by 
Neyman and Pearson not only connects vari- 
ous tests already in use but also can be and 
has been extended to the solution of many 
new problems. 

Let $x_1, x_2, \ldots, x_n$ represent a sample of $n$
observations. Assuming that these data
constitute a random sample of some population,
the statistician is concerned about specifying
the probability law of the latter. Any
assumption which he makes regarding this law is
considered a statistical hypothesis. Having
advanced the latter, the statistician is next
concerned about a means for testing whether or
not it is reasonable to expect that the observed
sample would arise if the proposed hypothesis
were, in fact, true.² Obviously, any test based
on the theory of probability will by itself
provide no valuable evidence as to the truth or
falsehood of a hypothesis in any particular
case. Nevertheless, without hope of knowing
whether or not each separate set of assumptions
actually obtains, we may still expect to
be able to set up a rule of behavior which, if
followed, would insure that in the long run of
experience we would not, too often, be wrong.
Such a rule, based on the principle of
likelihood, was developed by Neyman and Pearson.
Before outlining the steps to be followed in
setting up this rule in an illustrative case, we
shall define two terms to be used in the
discussion.

¹ The manner of defining the probability of alternatives
will be given later (p. 4).

The term simple hypothesis is used to desig- 
nate a set of assumptions which specify com- 
pletely the probability law governing the 
population of which a sample has been 
observed. For example, the normal probability 
law, 


$$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-a)^2}{2\sigma^2}} \tag{1}$$


represents that which might hold for an infi- 
nite set of populations. This clearly does not 
specify the population completely because it 
contains two unspecified parameters $a$ and $\sigma$.
The hypothesis which specifies that the ob- 
served data comprised a sample from that 
particular member of this set in which $a = m$
and $\sigma = \sigma_1$, is an example of a simple
hypothesis. In contrast to such a set of assumptions,
a composite hypothesis is one that specifies the 
functional form of the probability law but 
does not specify one or more of the parame- 
ters involved. Thus, the assumption that the 
observed data comprise a sample of a popula- 
tion obeying (1) is considered a composite 
hypothesis with two degrees of freedom be- 
cause there are two unspecified parameters, 
$a$ and $\sigma$. If we assume that the observational
data comprise a sample of any one of the
subset of populations characterized by the
probability law (1) and also having $a = m$, we
speak of such a set of assumptions as
constituting a composite hypothesis of one degree
of freedom.

² A hypothesis is said to be true if its specifications
concerning the probability law of the population are the ones
which actually do obtain.

Neyman and Pearson [6] point out the 
fact that in common statistical practice, when 
one was concerned with testing whether or not 
an observed sample had been drawn from a 
certain specified one or class of populations, 
the criteria used for such a test were usually 
some functions of the moment coefficients of 
the sample. As long as the variation among the 
observations is approximately represented by 
the normal frequency law, moments appear to 
be the most appropriate sample measures to 
be used. However, there is still considerable 
choice in the particular function of these
moments that is most appropriate to test a 
given hypothesis. There is ample evidence in 
the literature which illustrates considerable 
confusion over this point. As a consequence 
of this sort of confusion, it seems necessary 
that the ideas involved in the testing of 
hypotheses be more clearly understood. 

As mentioned previously the “principle of 
likelihood” has been proposed and developed 
by Neyman and Pearson [4] as a means for 
determining appropriate criteria for testing a 
broad class of statistical hypotheses. In many 
problems confronting the investigator, the 
hypotheses to be tested concern different pop- 
ulations from which the observed sample may 
have been drawn. Therefore, instead of speak- 
ing about sets of simple hypotheses it often 
is convenient to speak of sets of populations. 
In order to explain the operation of the
“principle of likelihood,” let us suppose we
have the following set of data: $x_1, x_2, \ldots, x_n$.
Let $h_0$ be the simple hypothesis to be tested
so that $p(x_1, x_2, \ldots, x_n / h_0)$³ is completely
specified. Let $T$ represent the whole set of
simple hypotheses $(h_1, h_2, \ldots, h_t, \ldots)$ any one
of which we are prepared to accept alternative
to $h_0$.⁴ One or more of these $h_t$'s is of such
a nature that $p(x_1, x_2, \ldots, x_n / h'_t)$ is greater
than the probability that the observed sample
occurred under the circumstances specified by
any other hypothesis belonging to the set $T$.
Designate $p(x_1, x_2, \ldots, x_n / h'_t)$ for that
particular $h_t$ as $p(T\text{-max})$ and
$p(x_1, x_2, \ldots, x_n / h_0)$ as $p_0$. Under these
circumstances the ratio

$$\lambda = \frac{p_0}{p(T\text{-max})} \tag{2}$$

is designated as the likelihood of hypothesis
$h_0$ as tested against the class of admissible
simple hypotheses, $T$. In other words, lambda,
$\lambda$, is the ratio of the likelihood of the
hypothesis tested, $h_0$, to the likelihood of that
hypothesis of an admissible set, $T$, (of which $h_0$ is
a member) which gives the maximum
probability for the observed sample. This quantity
$\lambda$ is used as a criterion for setting up our rule
of behavior regarding the rejection or
non-rejection of hypothesis, $h_0$. Before discussing
the critical level for $\lambda$, we shall explain how $\lambda$
is derived in the case of composite hypotheses.

³ $p(x_1, x_2, \ldots, x_n / h_0)$ represents the probability that the
observed sample occurred under the assumption that the
conditions specified by the hypothesis actually hold.

⁴ This set is said to be composed of other admissible
hypotheses.
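
To make the ratio for simple hypotheses concrete, the following sketch evaluates it for a small sample. Everything specific in it is an assumption made for illustration: the population is taken to be normal with known standard deviation 1, the tested hypothesis is a mean of zero, the admissible set is a finite grid of candidate means, and the names `sample_probability` and `likelihood_ratio` as well as the data are hypothetical.

```python
import math


def sample_probability(xs, a, sigma):
    """Joint normal density of the sample under the simple hypothesis (a, sigma)."""
    n = len(xs)
    ss = sum((x - a) ** 2 for x in xs)
    return (1.0 / (sigma * math.sqrt(2.0 * math.pi))) ** n * math.exp(-ss / (2.0 * sigma ** 2))


def likelihood_ratio(xs, tested, admissible):
    """Probability under the tested hypothesis divided by the largest probability in the set."""
    p_tested = sample_probability(xs, *tested)
    p_max = max(sample_probability(xs, a, sigma) for a, sigma in admissible)
    return p_tested / p_max


sample = [0.8, 1.3, -0.2, 1.9, 0.6]                   # hypothetical observations
tested = (0.0, 1.0)                                   # tested hypothesis: mean 0, sigma 1
admissible = [(k / 10.0, 1.0) for k in range(-20, 21)]  # grid of means; contains `tested`

lam = likelihood_ratio(sample, tested, admissible)
print(f"lambda = {lam:.4f}")
```

Because the tested hypothesis belongs to the admissible set, the ratio can never exceed unity; a small value signals that some other admissible hypothesis makes the observed sample considerably more probable.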

Let $H$ represent a composite hypothesis
including the set of simple hypotheses, $h_0$, $h_1$,
$h_2$, $h_3$, $\ldots$. Designate the set of probabilities
that the observed sample occurred subject to
the assumptions specified in $H$, by the
expression $p(x_1, x_2, \ldots, x_n / H)$. Let $H_0$ represent a
composite hypothesis composed of a sub-set
of the simple hypotheses included in the set $H$.
In order to determine the value of $\lambda$
appropriate for testing the composite hypothesis $H_0$,
it is necessary to determine the upper bound
of $p(x_1, x_2, \ldots, x_n / H_0)$ which is designated by
$p(H_0\text{-max})$. If the set of hypotheses $H$
contains all those admissible, $H_0$ and the
alternatives, the upper bound of $p(x_1, x_2, \ldots, x_n / H)$
is determined and may be designated by
$p(H\text{-max})$. Then the likelihood criterion for
testing $H_0$ against the set of alternatives
included in the set $H$, is

$$\lambda = \frac{p(H_0\text{-max})}{p(H\text{-max})} \tag{3}$$


In most cases met in practice, the
probabilities corresponding to the different simple
hypotheses of the set $H$ are continuous and
differentiable functions of a certain number, $s$,
of parameters $\theta_1, \theta_2, \ldots, \theta_r, \ldots, \theta_s$. Let

$$P(\theta_1, \theta_2, \ldots, \theta_r, \theta_{r+1}, \ldots, \theta_s;\ x_1, x_2, \ldots, x_n) \tag{4}$$

designate such a function of these $s$
parameters with the values of the $x$'s fixed as
determined by observation. Under these conditions
$p(H\text{-max})$ or $p(T\text{-max})$ is often the maximum
value of (4) with respect to all possible
systems of the $\theta$'s, and the fixed values of the
$x$'s. If $H_0$ is a composite hypothesis with $(s-r)$
degrees of freedom, it specifies $r$ parameters
such as $\theta_1, \theta_2, \ldots, \theta_r$ and leaves the others
unspecified. Then $p(H_0\text{-max})$ is often the
maximum of (4) with respect to all possible
systems of $\theta_{r+1}, \theta_{r+2}, \ldots, \theta_s$ and with the fixed
values of the $x$'s and $\theta_1, \theta_2, \ldots, \theta_r$.

Having thus indicated the manner in which
the criterion, $\lambda$, is determined, we must now
turn our attention to a means for finding the
critical value of $\lambda$ against which we compare
the value of the criterion found in any
particular case. If for any set of observations $\lambda$ is
small, this indicates that the set of admissible
hypotheses contains some which, in the light
of the observed data, are more probable than
$H_0$, the hypothesis tested. Under such
circumstances we are inclined to reject the latter.
Suppose, for example, that the investigator
decided that it would do no great harm if, in
the long run, he would not reject $H_0$⁵ unjustly
more than once in a hundred times. This
arbitrarily chosen level of significance, $\varepsilon$, aids in
the determination of the critical value for $\lambda$,
which may be designated as $\lambda_0$. Let the
probability, determined by the hypothesis $H_0$, that
the observations will give us a value of
$\lambda \le \lambda_0$,⁶ be designated as

$$p\{(\lambda \le \lambda_0)/H_0\}$$

The critical value, $\lambda_0$, is determined so that

$$p\{(\lambda \le \lambda_0)/H_0\} \le \varepsilon \tag{5}$$

If we thus determine $\lambda_0$ and adopt as our rule
of behavior: “reject $H_0$ when $\lambda \le \lambda_0$,” the
probability of rejecting $H_0$ when it is, in fact,
true is

$$P = \varphi(H_0)\, p\{(\lambda \le \lambda_0)/H_0\} \tag{6}$$

which is equal to or less than $\varepsilon$ if $\varphi(H_0)$ is
taken as the unknown a priori probability of
the hypothesis $H_0$.

Although it is not possible from observation
of a sample or even of a larger number of
samples to determine the population from
which they were drawn, an approach to this
objective may be made by using some clearly
defined conception of probability to determine
a “probable” or “likely” form of the
population sampled. Without claiming that the
method based on the principle of likelihood is
necessarily the best to adopt, at any rate, it
does provide a procedure for discriminating
between populations more likely to have been
sampled and observed in any particular
instance from those less likely. This is
accomplished by considering the likelihood of
hypotheses alternative to the one tested.

In making a decision in regard to whether any
hypothesis such as $H_0$ does provide objectionable
assumptions concerning the specification of the
population from which an observed sample has
been drawn, there are two sources of errors:
namely, (I) $H_0$ may be rejected when the
population sampled is accurately specified by
$H_0$ or (II) $H_0$ may not be rejected when the
population actually sampled is not the one
specified by $H_0$ but is characterized precisely
by some alternative hypothesis such as $H_1$.
In the paragraphs immediately preceding this
one, consideration was given to this first
source of error and, finally, the probability of
the occurrence of errors of type (I) was given
by (6). The coefficient, $\varepsilon$, furnishes only an
upper bound for (6) in most practical
problems because $\varphi(H_0)$ can rarely be specified
in quantitative terms. [7] Although this
situation is not ideal, a consideration of the
consequences of errors of type (I) reveals that
knowledge of this upper limit is not as
inadequate as it may appear. If $H_0$ is rejected when
it is the correct characterization of the
population sampled, the route to our goal, that is,
the proper specification of the latter, is thus
closed for the time being. This is true
regardless of the hypothesis tested so that errors of
this type are often described as “equivalent.”
However, in failing to reject $H_0$ when some
other hypothesis such as $H_1$ correctly specifies
the population sampled, the consequences of
this type of error depend on the difference
between $H_0$ and $H_1$. Thus, if $H_1$ is not greatly
at variance with $H_0$, there would often be
negligible, if any, serious consequences of failure
to reject $H_0$ while the gravity of the result
would steadily increase as the difference
between $H_0$ and $H_1$ increases. It follows, then,
that a consideration of the consequences of
the two types of errors leads us to conclude
that it is the magnitude of errors of type (I)
that matters, while it is what may be termed
the quality of errors of type (II) that must
be considered.

⁵ If an observed sample actually had been drawn from the
population specified by $H_0$ and the investigator had decided
that $\lambda$ was too small for non-rejection of $H_0$, we would
consider this as unjustly rejecting the hypothesis tested.

⁶ $\lambda_0$ may be any number between zero and unity.

The probability of rejecting $H_0$ when an
alternative $H_1$ is true has been termed the
“power of the test” with regard to $H_1$. [8]
Hence, the difference between unity and the
power of the test gives the probability of
errors of the type (II). This function of the
analysis of variance tests has been treated by
Tang. [10] Thus decisions made on the basis
of the maximum likelihood ratio (3) eliminate
type (II) errors with increasing stringency as
$H_1$ differs more and more from $H_0$.
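
The long-run behavior of such a rule can be checked by simulation. The sketch below is an illustration under stated assumptions and is not taken from the paper: the population is normal with known standard deviation 1, the tested hypothesis is a mean of zero, and all real means are admissible, in which case the ratio reduces to $\exp(-n\bar{x}^2/2)$. The function names, the sample size of 10, the seed and the alternative means 0.5 and 1.0 are all hypothetical choices.

```python
import math
import random
from statistics import NormalDist, fmean


def lam_known_sigma(xs, sigma=1.0):
    """Likelihood ratio for 'mean = 0' against all means, with sigma assumed known."""
    n = len(xs)
    return math.exp(-n * fmean(xs) ** 2 / (2.0 * sigma ** 2))


def critical_lambda(eps):
    """Critical value: under the tested hypothesis, P(lambda <= critical value) = eps."""
    z = NormalDist().inv_cdf(1.0 - eps / 2.0)
    return math.exp(-z * z / 2.0)


def rejection_rate(true_mean, n, eps, trials, rng):
    """Share of simulated samples for which the rule rejects the tested hypothesis."""
    lam0 = critical_lambda(eps)
    rejected = 0
    for _ in range(trials):
        xs = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if lam_known_sigma(xs) <= lam0:
            rejected += 1
    return rejected / trials


rng = random.Random(12345)
# True mean 0.0 gives the type (I) error rate; the other means give the power.
rates = {mean: rejection_rate(mean, 10, 0.01, 20000, rng) for mean in (0.0, 0.5, 1.0)}
for mean, rate in rates.items():
    print(f"true mean {mean:.1f}: rejected in {100 * rate:.1f}% of samples")
```

With the level set at one in a hundred, the rejection rate under the tested hypothesis should stay near that level, while the rate of rejection rises as the true mean moves away from zero, which is the sense in which type (II) errors are eliminated with increasing stringency.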

With the foregoing discussion of the essen- 
tial elements of the Neyman—Pearson use of 
the principle of likelihood in the development 
of criteria for testing statistical hypotheses, 
it is now possible to present the general 
formulas for the purpose of illustrating spe- 
cifically how such criteria are determined for 
a number of tests. 

(a) A test of the significance of the mean
of a sample drawn from a population in which
the distribution is normal.—Let $x_1, x_2, \ldots, x_n$
represent the $n$ observational values drawn
from a population in which the law of
distribution is assumed to be of the normal type
with the standard deviation $\sigma$. The probability
of the simultaneous occurrence of the $n$ such
observations is

$$p\{(x_1, x_2, \ldots, x_n)/h\} = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} e^{-\frac{\Sigma (x-a)^2}{2\sigma^2}} \tag{7}$$

where capital sigma denotes the sum for all $n$
observations, $a$ represents a constant, and $h$
denotes the whole class, $T$, of simple hypotheses
any one of which we are prepared to accept
alternative to $h_0$, the one we wish to test. This
latter hypothesis may be designated symbolically

$$a = 0 \tag{8}$$

In order to determine the value of p(7T-max) 
as discussed on page 5 above, it is necessary 
to find the maximum value of (7). This is 
most easily accomplished by determining the 
minimum of the absolute value of the logarithm
of (7).

$$\log p = -n \log \sigma - n \log \sqrt{2\pi} - \frac{\Sigma (x-a)^2}{2\sigma^2} \tag{9}$$


In order to determine the values of $\sigma$ and $a$
which when substituted in (7) will maximize
that quantity, we consider for the moment
that $\sigma$ and $a$ are continuous variables while
$n$ and the sum of the $x$'s are fixed constants.
Then we differentiate (9) partially with
respect to $\sigma$ and $a$, set these derivatives equal to
zero and solve the resulting system of
equations for $\sigma$ and $a$. The derivatives are

$$\frac{\partial \log p}{\partial \sigma} = -\frac{n}{\sigma} + \frac{\Sigma (x-a)^2}{\sigma^3} \tag{10}$$

$$\frac{\partial \log p}{\partial a} = \frac{\Sigma (x-a)}{\sigma^2} \tag{11}$$

When these are set equal to zero, the
following system of equations is obtained

$$n\sigma^2 = \Sigma (x-a)^2, \qquad \Sigma (x-a) = 0 \tag{12}$$

Hence

$$a = \frac{\Sigma x}{n} \tag{13}$$

If $\bar{x}$ is used to represent the right member of
(13) then

$$n\sigma^2 = \Sigma (x-\bar{x})^2 \tag{14}$$

Denote the right member of (14) by $n S_a^2$.
The next step in the process of obtaining
the appropriate criterion for testing the
significance of the mean of such a sample, is to
consider the modification in (7) associated
with the presence of the value zero for $a$, as
proposed by the hypothesis, $h_0$. The
probability of the occurrence of the observed sample
subject to the hypothesis, $h_0$, is

$$p\{(x_1, x_2, \ldots, x_n)/h_0\} = \left(\frac{1}{\sigma_0\sqrt{2\pi}}\right)^{n} e^{-\frac{\Sigma x^2}{2\sigma_0^2}} \tag{15}$$

hence the logarithm of $p_0$ is

$$\log p_0 = -n \log \sigma_0 - n \log \sqrt{2\pi} - \frac{\Sigma x^2}{2\sigma_0^2} \tag{16}$$

Since $n$ and $\Sigma x^2$ are fixed by the particular
observations, the only variable in (16) is $\sigma_0$.
Considering $\sigma_0$ as a continuous variable, we
shall determine the value which it must
assume to render (15) a maximum by
differentiating (16) and setting this derivative
equal to zero.

$$\frac{\partial \log p_0}{\partial \sigma_0} = -\frac{n}{\sigma_0} + \frac{\Sigma x^2}{\sigma_0^3} \tag{17}$$

When this derivative is set equal to zero, we
have

$$\sigma_0^2 = \frac{\Sigma x^2}{n} \tag{18}$$

Denote the right member of (18) by $S_r^2$.

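If we substitute in (7) the values of $\sigma$ and
$a$ determined in (13) and (14) which will
render (7) a maximum, we obtain

$$p(T\text{-max}) = \left(\frac{1}{S_a\sqrt{2\pi}}\right)^{n} e^{-\frac{n}{2}} \tag{19}$$

Similarly, by substituting the value of $\sigma_0$
determined in (18) which will render (15)
a maximum, we obtain the quantity
designated by $p_0$ on page 5. Thus

$$p_0 = \left(\frac{1}{S_r\sqrt{2\pi}}\right)^{n} e^{-\frac{n}{2}} \tag{20}$$

Hence the ratio (2), in this instance, is

$$\lambda = \left(\frac{S_a}{S_r}\right)^{n} \tag{21}$$

In practice, it has become customary to use
$U$ as a criterion instead of $\lambda$, where

$$U = \lambda^{2/n} = \frac{S_a^2}{S_r^2} \tag{22}$$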


Among others, Kolodziejczyk [3] has shown
that under the assumption that random
samples are chosen from the same normal
population, this ratio ($U$) is distributed as the
incomplete beta function. Since the integral
of this function has been tabled [9], it is
possible to determine the value of $U$
corresponding to the particular probability level
($\varepsilon$) chosen as the risk we are willing to take
of rejecting a hypothesis that correctly
specifies the parameters of the populations
sampled. Johnson and Neyman [2] have
published an extension of Pearson’s table [9]
which includes the values required for $(n - s)$
greater than 100 where $n$ represents the
number of observations and $s$ the number of
independent parameters required for specifying
the class of admissible hypotheses.

In the case of this test of the significance
of the mean of a sample, the investigator
would calculate the value of (22) by
computing the ratio of $\Sigma (x - \bar{x})^2$ to $\Sigma x^2$. Then he
would enter Pearson’s table [9] if $n$ were less
than 101 or Johnson and Neyman’s [2] if $n$
were 101 or greater. If he found the observed
value of (22) was greater than the one given
in the appropriate table corresponding to the
probability level ($\varepsilon$), he would accept the null
hypothesis ($h_0$). It is clear that (22) would
be equal to unity if the mean of the sample
($\bar{x}$) were zero and would decrease as the mean
would become numerically greater.
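
The computation described for this test is brief enough to sketch directly. The example below is a hedged illustration: the data and the function names `u_criterion_mean` and `student_t` are hypothetical, and the comparison with Student's $t$ is an algebraic identity added here as a check, $U = 1/(1 + t^2/(n-1))$, not a step taken from the paper.

```python
import math
from statistics import fmean


def u_criterion_mean(xs):
    """Ratio of the sum of squares about the sample mean to the sum of squares about zero."""
    xbar = fmean(xs)
    ss_about_mean = sum((x - xbar) ** 2 for x in xs)
    ss_about_zero = sum(x * x for x in xs)
    return ss_about_mean / ss_about_zero


def student_t(xs):
    """One-sample t statistic for the hypothesis that the population mean is zero."""
    n = len(xs)
    xbar = fmean(xs)
    variance = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    return xbar / math.sqrt(variance / n)


sample = [0.8, 1.3, -0.2, 1.9, 0.6, 1.1]  # hypothetical observations
n = len(sample)
u = u_criterion_mean(sample)
t = student_t(sample)
lam = u ** (n / 2)                        # lambda recovered from U

print(f"U = {u:.4f}, lambda = {lam:.6f}, t = {t:.4f}")
print(f"1 / (1 + t^2/(n-1)) = {1.0 / (1.0 + t * t / (n - 1)):.4f}")
```

Since $U$ is a decreasing function of $t^2$, comparing $U$ with its tabled value leads to the same decision as the familiar two-sided $t$ test.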


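(b) The likelihood test of the significance
of the difference between the means of two
samples drawn from normal populations with
equal though unknown standard deviations.—
Let $x_1, x_2, \ldots, x_{n_1}$ and $x_{n_1+1}, x_{n_1+2}, \ldots,
x_{n_1+n_2}$ represent the observational values in
two samples drawn from two normal
populations, $P_1$ and $P_2$ respectively. Let $\sigma$ represent
the unknown common standard deviation of
the normal distribution for these populations.
The probability of the occurrence of $(n_1 + n_2)$
such observations subject to the hypotheses
represented by $H$ is

$$p\{(x_1, x_2, \ldots, x_{n_1+n_2})/H\} = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n_1+n_2} e^{-\frac{\Sigma_1 (x-a_1)^2 + \Sigma_2 (x-a_2)^2}{2\sigma^2}} \tag{23}$$

where $\Sigma_1$ and $\Sigma_2$ denote the sums for $n_1$ and
$n_2$ observations respectively, $a_1$ and $a_2$
represent constants and $H$ denotes the whole class,
$T$, of simple hypotheses any one of which we
are prepared to accept alternative to $H_0$, the
one we wish to test. This latter may be
designated as

$$a_1 = a_2 \tag{24}$$

In order to determine $p(T\text{-max})$ it is
necessary to find the maximum value of (23). Here
again, we minimize the numerical value of the
logarithm of (23).

$$\log p = -(n_1 + n_2)\left(\log \sigma + \log \sqrt{2\pi}\right) - \frac{\Sigma_1 (x-a_1)^2 + \Sigma_2 (x-a_2)^2}{2\sigma^2} \tag{25}$$

For the moment, we consider $\sigma$, $a_1$ and $a_2$ as
continuous variables and all other quantities
in the right member of (25) as constants.

$$\frac{\partial \log p}{\partial \sigma} = -\frac{n_1 + n_2}{\sigma} + \frac{\Sigma_1 (x-a_1)^2 + \Sigma_2 (x-a_2)^2}{\sigma^3} \tag{26}$$

$$\frac{\partial \log p}{\partial a_1} = \frac{\Sigma_1 (x-a_1)}{\sigma^2} \tag{27}$$

$$\frac{\partial \log p}{\partial a_2} = \frac{\Sigma_2 (x-a_2)}{\sigma^2} \tag{28}$$

If each of these partial derivatives is set equal
to zero, we obtain the following values of $a_1$,
$a_2$ and $\sigma$ which maximize (23).

$$a_1 = \frac{\Sigma_1 x}{n_1} \tag{29}$$

$$a_2 = \frac{\Sigma_2 x}{n_2} \tag{30}$$

Denote the right hand members of (29) and
(30) by $\bar{x}_1$ and $\bar{x}_2$ respectively, the means of
the observed samples.

$$(n_1 + n_2)\,\sigma^2 = \Sigma_1 (x-\bar{x}_1)^2 + \Sigma_2 (x-\bar{x}_2)^2 \tag{31}$$

Let $(n_1 + n_2) S_a^2$ be used to designate the
right member of (31) so that $S_a^2$ is the value
$\sigma^2$ must have in (23) in order that this $p$ may
have its maximum value.

As in the previous example, we must now
consider the modification in (23) that is
associated with the presence of $a_1 = a_2 = a$ as
proposed by the hypothesis, $H_0$, for which we
are determining the appropriate criterion
based on the principle of likelihood. The
probability of the occurrence of the observed
samples subject to the hypothesis, $H_0$, is

$$p_0 = \left(\frac{1}{\sigma_0\sqrt{2\pi}}\right)^{n_1+n_2} e^{-\frac{\Sigma_1 (x-a)^2 + \Sigma_2 (x-a)^2}{2\sigma_0^2}} \tag{32}$$

The logarithm of $p_0$ is

$$\log p_0 = -(n_1 + n_2) \log \sigma_0 - (n_1 + n_2) \log \sqrt{2\pi} - \frac{\Sigma_1 (x-a)^2 + \Sigma_2 (x-a)^2}{2\sigma_0^2} \tag{33}$$

$$\frac{\partial \log p_0}{\partial \sigma_0} = -\frac{n_1 + n_2}{\sigma_0} + \frac{\Sigma_1 (x-a)^2 + \Sigma_2 (x-a)^2}{\sigma_0^3} \tag{34}$$

$$\frac{\partial \log p_0}{\partial a} = \frac{\Sigma_1 (x-a) + \Sigma_2 (x-a)}{\sigma_0^2} \tag{35}$$

When these partial derivatives are each set
equal to zero, equations (36) and (37) are
obtained.

$$a = \frac{\Sigma_1 x + \Sigma_2 x}{n_1 + n_2} \tag{36}$$

Thus the value which $a$ must have to
maximize (32) is the arithmetic mean of all
$(n_1 + n_2)$ observations which may be
denoted by $\bar{x}_0$.

$$(n_1 + n_2)\,\sigma_0^2 = \Sigma_1 (x-\bar{x}_0)^2 + \Sigma_2 (x-\bar{x}_0)^2 \tag{37}$$

Let $(n_1 + n_2) S_r^2$ be used to designate the
right member of (37). Then the value of $\lambda$,
appropriate for testing the significance of the
difference between the means of two samples
drawn from normal populations with the same
though unknown variance, is found by taking
the ratio of the maximum values of (23) and
(32). The former is found by substituting
(31) in (23). This gives

$$p(H\text{-max}) = \left(\frac{1}{S_a\sqrt{2\pi}}\right)^{n_1+n_2} e^{-\frac{n_1+n_2}{2}} \tag{38}$$

Substituting (37) in (32) gives

$$p(H_0\text{-max}) = \left(\frac{1}{S_r\sqrt{2\pi}}\right)^{n_1+n_2} e^{-\frac{n_1+n_2}{2}} \tag{39}$$

Hence

$$\lambda = \left(\frac{S_a}{S_r}\right)^{n_1+n_2} \tag{40}$$

$$U = \lambda^{\frac{2}{n_1+n_2}} = \frac{S_a^2}{S_r^2} \tag{41}$$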


It is to be noted that $(n_1 + n_2) S_a^2$ is the sum
of the squares of the deviations from the
sample means while $(n_1 + n_2) S_r^2$ is the sum of
the squares of the deviations from the common
mean of the two samples. If the sample means
are not greatly divergent, the value of $U$ given
in (41) is near to unity and would reach that
limit when the sample means were identical.

(c) The likelihood test of the significance 
of the difference among the means of several 
samples drawn from normal populations with 
equal though unknown standard deviations. — 
The derivation of the appropriate criterion for 
this test would follow step by step the work 
shown in section (b) above. If $N$ is
used to represent the total number of
observations in the several samples,

$$N S_a^2 = \Sigma_1 (x-\bar{x}_1)^2 + \Sigma_2 (x-\bar{x}_2)^2 + \cdots \tag{42}$$

where $\bar{x}_1, \bar{x}_2, \ldots$ represent the means of
the respective samples. Also

$$N S_r^2 = \Sigma_1 (x-\bar{x}_0)^2 + \Sigma_2 (x-\bar{x}_0)^2 + \cdots \tag{43}$$

where $\bar{x}_0$ represents the common mean of all
$N$ observational values contained in the
several samples. The reader who is familiar
with the method of “analysis of variance” will
recognize its similarity to the one given here.
The quantities $N S_a^2/(N - r - 1)$ and
$N(S_r^2 - S_a^2)/r$ are
those which Fisher [1] refers to as two
independent estimates of the common variance.
(The quantity $r$ in the above is the number of
independent parameters specified by the
hypothesis tested.)
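
The similarity to the analysis of variance can be checked numerically. The sketch below is illustrative and rests on assumptions: the three samples are hypothetical, the function names are introduced only for this example, and the stated link between the criterion and the usual variance ratio $F$, namely $U = 1/(1 + (k-1)F/(N-k))$ for $k$ samples, is a standard identity added here as a check and is not quoted from the paper.

```python
from statistics import fmean


def sums_of_squares(samples):
    """Return (sum of squares about the separate sample means, about the common mean)."""
    pooled = [x for sample in samples for x in sample]
    common_mean = fmean(pooled)
    within = 0.0
    for sample in samples:
        mean = fmean(sample)
        within += sum((x - mean) ** 2 for x in sample)
    total = sum((x - common_mean) ** 2 for x in pooled)
    return within, total


def u_and_f(samples):
    """Likelihood criterion U and the analysis-of-variance ratio F for several samples."""
    k = len(samples)
    n_total = sum(len(sample) for sample in samples)
    within, total = sums_of_squares(samples)
    u = within / total
    f = ((total - within) / (k - 1)) / (within / (n_total - k))
    return u, f


groups = [[4.1, 5.0, 4.6], [5.9, 6.3, 5.5, 6.1], [4.9, 5.2, 5.6]]  # hypothetical samples
k = len(groups)
n_total = sum(len(g) for g in groups)
u, f = u_and_f(groups)

print(f"U = {u:.4f}, F = {f:.4f}")
print(f"1 / (1 + (k-1)F/(N-k)) = {1.0 / (1.0 + (k - 1) * f / (n_total - k)):.4f}")
```

With two samples the same functions give the criterion of section (b), so a separate routine for that case is not needed.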

(d) The likelihood criterion appropriate for 
testing the significance of a correlation or 
regression coe ficient—Let x and y represent 
the variables whose correlation is to be tested. 
To simplify the formulas, assume that x and y 
are already in the form of deviations from 
their respective sample means. The following 
equation will aid in translating this problem 
into algebraic symbols. 


where 6 is a constant to be determined and 
z denotes that portion of y not linearly asso- 
ciated with x. It is these quantities (2) which 
the investigator is willing to assume are dis- 
tributed normally in the population from 
which the observed sample was drawn. Then 
the probability of the occurrence of the sam- 
ple, subject to the assumption that 6 and « 
have some specified values is 


where n denotes the number in the observed
sample, capital sigma indicates the sum for all
n observations, and H denotes the whole class
($\Omega$) of simple hypotheses, any one of which
we are willing to accept as an alternative to $H''$, the
one we wish to test. The latter in its null form
would be stated thus: there is no significant
linear association between x and y in the population
from which the observed sample was
drawn in a random manner. In terms of the
above symbols, this hypothesis may be stated

$$b = 0 \qquad (46)$$

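The logarithm of (45) is

$$\log p = -n \log \sigma - n \log \sqrt{2\pi} - \frac{\Sigma (y - bx)^2}{2\sigma^2} \qquad (47)$$

As before, to determine the values of b and $\sigma$
which maximize (45) we proceed as in sections
(a) and (b) above.

$$\frac{\partial \log p}{\partial \sigma} = -\frac{n}{\sigma} + \frac{\Sigma (y - bx)^2}{\sigma^3} \qquad (48)$$

$$\frac{\partial \log p}{\partial b} = \frac{\Sigma x (y - bx)}{\sigma^2} \qquad (49)$$
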
When these are set equal to zero, we obtain
the following system of equations

$$b = \frac{\Sigma xy}{\Sigma x^2} \qquad (50)$$

$$n \sigma^2 = \Sigma (y - bx)^2 \qquad (51)$$


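This quantity in the right member of (51)
is recognized as the sum of the squares of the
residual deviations from the regression line,
which may be denoted by $n S_z^2$.

Next, we must consider the modification in
(45) associated with the presence of the value
zero for b as proposed by $H''$, the hypothesis
to be tested.

$$p'' = \left(\frac{1}{\sigma''\sqrt{2\pi}}\right)^{n} \exp\left[-\frac{\Sigma y^2}{2\sigma''^2}\right] \qquad (52)$$

Here, the logarithm of $p''$ is:

$$\log p'' = -n \log \sigma'' - n \log \sqrt{2\pi} - \frac{\Sigma y^2}{2\sigma''^2} \qquad (53)$$

The only variable involved in (53) is $\sigma''$, so
we must determine the value it must have to
render $p''$ a maximum.

$$\frac{\partial \log p''}{\partial \sigma''} = -\frac{n}{\sigma''} + \frac{\Sigma y^2}{\sigma''^3} = 0 \qquad (54)$$

$$n \sigma''^2 = \Sigma y^2 \qquad (55)$$
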

Denote the quantity in the right member of
(55) by $n S_y^2$. Hence from (51) $S_z^2 = \sigma^2$ and
from (55) $S_y^2 = \sigma''^2$. Substituting these
values in (45) and (52) respectively gives
the maximum values of these two functions.

$$p(H\text{-max}) = \left(\frac{1}{S_z\sqrt{2\pi}}\right)^{n} e^{-n/2} \qquad (56)$$

$$p(H''\text{-max}) = \left(\frac{1}{S_y\sqrt{2\pi}}\right)^{n} e^{-n/2} \qquad (57)$$


Hence the appropriate criterion for testing
the null hypothesis, $H''$, is

$$\lambda = \frac{p(H''\text{-max})}{p(H\text{-max})} = \left(\frac{S_z}{S_y}\right)^{n} \qquad (58)$$

Here $n S_z^2$ is the sum of the squares of the
residual deviations from the regression line
and $n S_y^2$ is the sum of the squares of the
deviations from the sample mean of the
dependent variable, y.


The four criteria derived in sections (a)
to (d) above show how the principle of likelihood
provides a common basis for several tests
of significance often required by the research
worker. Furthermore, the manner of derivation
of these gives him a pattern to follow in
the process of deriving appropriate criteria
for the specific research problems he may
have under investigation.


TESTING A CERTAIN HYPOTHESIS REGARDING
VARIANCES AFFECTED BY MEANS


PALMER O. JOHNSON and FEI TSAO
University of Minnesota


An examination of data, such as those displayed
in Tables I, II, and III giving the
heights of American, Scottish, and English
school children of different age groups, respectively,
illustrates the difficulty encountered
in making comparisons of variability
among the several groups or subgroups. It is
observed that the means of the heights of the
successive age groups increase from the youngest
to the oldest groups and that in general
the standard deviations also increase through
the same range. In order to secure a valid
comparison of variability, it would appear
necessary to make allowance for the differences
in the means of the groups [2]. The
present paper is concerned with this problem.

In the first place, when the hypothesis that
the samples come from populations with the
same variances or standard deviations was
tested, it was found that the hypothesis was
rejected for the several age-groups in each of
the six sex-nationality groups. In this case no
consideration was given to the apparent relationship
between the means and standard
deviations of the several age-groups (Table
IV). The procedure described below was developed
for testing the homogeneity of the
variances of the several age-groups after making
allowance for the variation in the means
of the groups.

We first transformed the original data in Tables
I, II, and III, consisting of the 58 means and
58 standard deviations of all the age groups,
to common logarithms by
the following equations:

$$X = 1000 \log M - 1900 \qquad (1)$$

$$Y = 1000 \log S' - 600 \qquad (2)$$

The logarithmic transformation was made to
stabilize the variances and to normalize the
data [1].

By the usual methods we then obtained the
regression equation for the prediction of Y
from X for the 58 age-groups.

$$\hat{Y} = 1.0049X + 8.15 \qquad (3)$$

where $\hat{Y}$ is the estimate of Y.
Also $\sigma_b = .1517$, $t = 6.62$, $P < .01$,
showing significant regression of Y on X.


TABLE I
HEIGHT (IN CENTIMETERS) OF AMERICAN CHILDREN (DATA FROM BOAS)*

                 Male                        Female
Age       N        M        S'        N        M        S'
  5     1535    108.42    4.896     1260    105.49    4.689
  6     3975    111.78    5.072     3618    110.70    5.082
  7     5379    116.89    5.274     4913    116.17    5.274
  8     5633    122.05    5.595     5289    121.22    5.605
  9     5531    126.89    5.736     5132    126.13    5.764
 10     5151    131.75    6.057     4827    131.24    6.249
 11     4759    136.17    6.378     4507    136.57    6.847
 12     4205    140.68    6.842     4187    142.54    7.584
 13     3573    145.91    7.695     3411    148.58    7.415
 14     2518    152.14    8.684     2537    153.41    6.731
 15     1481    158.50    8.884     1656    156.45    5.980

* Where M = mean value, S' = unbiased standard deviation, N = number of cases. This
will always be true throughout the paper. This table and Tables II and III are reproduced from
McNemar, Q. and Terman, L. M.: “Sex differences in variational tendency,” Genetic Psychology
Monographs, 18, 1936, pp. 19-21, but the computation of S' was made by the writers. The original
tables used different units of measurement. Here we use the same unit, i.e., the centimeter,
to make the necessary analysis possible.
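The transformation in equations (1) and (2) and the significance ratio quoted with equation (3) can be illustrated with a short script. This is a sketch, not the authors' computation: the function names `to_x` and `to_y` are hypothetical, the input row is the age-5 male entry of Table I, and the t ratio is assumed to be the regression coefficient divided by its standard error, which is consistent with the reported figures.

```python
import math


def to_x(mean_cm):
    """Equation (1): scaled common logarithm of the group mean."""
    return 1000 * math.log10(mean_cm) - 1900


def to_y(sd_cm):
    """Equation (2): scaled common logarithm of the group standard deviation."""
    return 1000 * math.log10(sd_cm) - 600


# Age-5 male row of Table I: M = 108.42, S' = 4.896.
x5 = to_x(108.42)
y5 = to_y(4.896)

# Assumed form of the test: coefficient divided by its standard error.
slope = 1.0049
slope_se = 0.1517
t_ratio = slope / slope_se

print(round(x5, 2), round(y5, 2), round(t_ratio, 2))
```

The constants 1900 and 600 only shift the origin, so they affect the intercept of the regression line but not its slope or the residuals.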
TABLE II
HEIGHT (IN CENTIMETERS) OF GLASGOW (SCOTLAND) SCHOOL CHILDREN
(DATA FROM ELDERTON)

                 Male                        Female
Age       N        M        S'        N        M        S'
  5     3322    106.17    6.271     3104    105.05    6.485
  6     3901    110.74    6.685     3828    109.65    6.627
  7     4200    115.82    7.358     3926    114.30    6.838
  8     4009    120.78    7.145     3817    119.20    7.155
  9     3880    125.48    7.186     3762    123.95    7.313
 10     3759    129.77    7.264     3518    128.52    7.595
 11     3632    134.11    7.483     3656    138.73    7.907
 12     3638    138.25    7.892     3224    139.29    8.405
        144.48    .995


TABLE III
HEIGHT (IN CENTIMETERS) OF BARLEY (ENGLAND) SCHOOL CHILDREN
(DATA FROM HABAKKUK)

                 Male                        Female
Age       N        M        S'        N        M        S'
  3      145     93.24    4.720      151     92.28    4.535
  4      664     96.90    4.742      600     96.31    4.662
  5      282    102.22    5.620      346    101.71    5.440
  6      363    110.36    5.592      324    109.43    5.483
  7      467    113.25    4.877      388    112.76    5.034
  8      861    120.14    5.725      820    119.68    5.507
  9      692    125.00    5.776      669    124.35    6.095
 10      324    128.46    6.338      333    127.37    6.322
 12             140.57


TABLE IV
THE RESULTS OF THE $L_1$ TESTS DISREGARDING THE RELATIONSHIP BETWEEN M AND S'

Nationality    Sex        L_1     Hypothesis tested*
                          .963    Rejected
               Female     .987    Rejected
               Female     .947    Rejected

* Hypothesis tested is the hypothesis that there is no significant difference between the variabilities of
the different age groups.


The next step consisted in the calculation
of $\Sigma (Y - \hat{Y})^2$ for each of the six sex-nationality
groups. The results are summarized in
Table V.

TABLE V
THE VALUE OF $\Sigma (Y - \hat{Y})^2$ FOR EACH SEX OF THE THREE NATIONALITIES

Nationality    Sex        Sum of (Y - Y-hat)^2
American       Male       20994
               Female     35158
Scottish       Male       34125
               Female     39216
English        Male       11047
               Female      7573

Finally, the discrepancies of the different
values of $\Sigma (Y - \hat{Y})^2$ were tested by using
the $L_1$ test [4, 5]. The necessary calculations
are given in Table VI.

It is noted that the $L_1$ test can be used to
test the homogeneity of several variances. In
our case, $\frac{1}{n_i} \Sigma (Y - \hat{Y})^2$ is a variance of
residuals. So we use formula (4) to determine
the value of the criterion $L_1$ [6].


TABLE VI
CALCULATIONS OF THE $L_1$ TEST FOR $\Sigma (Y - \hat{Y})^2$

 f_i    n_i    log n_i     theta'_i    log theta'_i
  9     11     1.0414       20994
  9     11     1.0414       35158       4.5460
  7      9      .9542       34125       4.5332
  7      9      .9542       39216       4.5935
  7      9      .9542       11047       4.0432
  7      9      .9542        7573
        58                 148113

Sum of n_i log n_i = 57.2510          Sum of n_i log theta'_i = 250.9919

$$\log L_1 = \log N - \frac{1}{N} \Sigma n_i \log n_i + \frac{1}{N} \Sigma n_i \log \theta'_i - \log (\Sigma \theta'_i)$$

$$= 1.7634 - .9871 + 4.3274 - 5.1706 = 9.9331 - 10, \qquad L_1 = .857$$

Harmonic mean value [3] of $f_i$:

$$f = \frac{6}{\frac{1}{9} + \frac{1}{9} + \frac{1}{7} + \frac{1}{7} + \frac{1}{7} + \frac{1}{7}} = 7.56$$


Referring to Nayer's Tables [5] of the $L_1$
distribution with K = 6 and f = 7.56, it is
found that the calculated $L_1$ is greater than
the 5% point. So we concluded that there was
no significant difference between these six
variances of the residuals. In other words,
there was no significant difference in the relationship
between the standard deviation and
mean for the several age groups in these six
subclasses. We might say that the differentiation
in variability for the several age groups
in the subclasses was uniformly affected by
the differentiation of mean values. It follows
that the variability as first observed was a
function of the mean rather than of age. We
conclude, therefore, that the increase of variability
in height from the youngest to the oldest
groups was not a true situation, but that the
variability in height was uniform from age to
age in each of the six sex-nationality groups.

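The arithmetic of Table VI can be retraced with a few lines of code. The sketch below assumes the six residual sums of squares and the group counts as they appear in the table; the function name `log10_l1` is hypothetical. Full-precision logarithms are used, so intermediate figures differ slightly from the four-place values printed in the table, but the final criterion and the harmonic mean agree to the precision reported.

```python
import math

# Number of age groups, degrees of freedom, and residual sums of squares
# for the six sex-nationality classes, as listed in Table VI.
n = [11, 11, 9, 9, 9, 9]
f = [9, 9, 7, 7, 7, 7]
theta = [20994, 35158, 34125, 39216, 11047, 7573]


def log10_l1(counts, sums_of_squares):
    """Common logarithm of the L1 criterion in the form used in Table VI."""
    total = sum(counts)
    return (
        math.log10(total)
        - sum(c * math.log10(c) for c in counts) / total
        + sum(c * math.log10(s) for c, s in zip(counts, sums_of_squares)) / total
        - math.log10(sum(sums_of_squares))
    )


l1 = 10 ** log10_l1(n, theta)
f_harmonic = len(f) / sum(1 / fi for fi in f)

print(round(l1, 3), round(f_harmonic, 2))
```

The criterion equals 1 when all the group variances are identical and falls toward 0 as they diverge, which is why a value above the tabled 5% point leads to acceptance of homogeneity.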


SUMMARY


When the measures of central position differ
significantly, we cannot use the $L_1$ test
directly in testing the hypothesis of equal variances.
A revised procedure is as follows:

1. Make a logarithmic transformation of
   the raw scores.
2. Find the regression equation for $\sigma_i$ on $\bar{x}_i$.
3. If the regression coefficient is significant,
   find the predicted $\sigma_i$ values from the
   regression equation.
4. Find the difference between the observed
   $\sigma_i$ and the predicted $\sigma_i$.
5. Use the sum of squares of these difference
   scores as the basis for the $L_1$ test.
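Steps 1 to 5 can be sketched in code. The example below is an illustration under several assumptions: it uses only the Barley rows for ages 3 to 10 from Table III rather than all 58 groups, it takes plain common logarithms without the scaling constants of equations (1) and (2), and the helper names `transform` and `fit_line` are hypothetical. It is not a reproduction of the paper's figures.

```python
import math

# (mean, standard deviation) pairs from Table III, ages 3 to 10.
male = [(93.24, 4.720), (96.90, 4.742), (102.22, 5.620), (110.36, 5.592),
        (113.25, 4.877), (120.14, 5.725), (125.00, 5.776), (128.46, 6.338)]
female = [(92.28, 4.535), (96.31, 4.662), (101.71, 5.440), (109.43, 5.483),
          (112.76, 5.034), (119.68, 5.507), (124.35, 6.095), (127.37, 6.322)]


def transform(rows):
    """Step 1: logarithmic transformation of means and standard deviations."""
    return [(math.log10(m), math.log10(s)) for m, s in rows]


def fit_line(points):
    """Step 2: least-squares regression of log s.d. on log mean."""
    count = len(points)
    mean_x = sum(x for x, _ in points) / count
    mean_y = sum(y for _, y in points) / count
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


groups = {"male": transform(male), "female": transform(female)}
pooled = [point for points in groups.values() for point in points]
slope, intercept = fit_line(pooled)

# Steps 3 and 4: predicted values and differences from the observed values.
residuals = [y - (intercept + slope * x) for x, y in pooled]

# Step 5: sum of squared differences within each group.
resid_ss = {
    name: sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    for name, points in groups.items()
}
print(slope, resid_ss)
```

The per-group sums of squares in `resid_ss` are the quantities that would then be compared by the $L_1$ test.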


An illustration has been given to show how
this procedure was applied in testing the
hypothesis of equality of standard deviations
after adjustment for inequality of means.


RECENT DEVELOPMENTS IN THE STATISTICAL ANALYSIS OF
RANKED DATA ADAPTED TO EDUCATIONAL RESEARCH


FRANK G. SCHULTZ
South Dakota State College


The analysis of variance technique is gradually
gaining recognition among workers in
the field of educational research. There are,
however, certain conditions under which this
powerful statistical tool cannot be used, either
because the variate is not normally distributed
or because the data are qualitative and therefore
cannot be measured but only indicated
in terms of ranks. Furthermore, the analysis
of variance technique involves calculations
which become rather laborious when the data
are extensive, so that more convenient methods
of analysis are in order even though they may
be somewhat less efficient from the standpoint
of statistical analysis.

Fortunately two recent developments in
statistical theory make possible the rigorous
analysis of ranked data by methods very
similar to the analysis of variance technique.
The first, which is reported by Friedman¹ and
which is referred to by him as the “method of
ranks,” is essentially an analysis of variance
of ranks. The second tool is an extension of
the “method of ranks” and results in what
Wallis² calls the rank correlation ratio. Wallis
points out that the statistic resulting from the
“method of ranks” technique is the appropriate
test of significance for the rank correlation
ratio and bears the same relation to it
as does the analysis of variance to the ordinary
correlation ratio. Both of the newer statistics
can be much more easily calculated
than the conventional correlation ratio and
the analysis of variance, and they are adaptable
to a wider range of problems.

¹ Milton Friedman, “The Use of Ranks to Avoid the
Assumption of Normality Implicit in the Analysis of Variance,”
Journal of the American Statistical Association, Vol. 32,
1937, pp. 675-701.
² W. Allen Wallis, “The Correlation Ratio for Ranked
Data,” Journal of the American Statistical Association, Vol.
34, 1939, pp. 533-538.


THE METHOD OF RANKS


The method of ranks is appropriate whenever
the data can be thrown into a two-way
table on the basis of two (or more) criteria.
Each row of entries in the several columns is
then ranked (1, 2, ..., p) in either the
increasing or decreasing order and the average
rank (sum of ranks/number of rows) is calculated
for each column. Assuming that the
ranks in the rank matrix are distributed in
random order (indicating that the ranks are
independent of the variable represented by the
columns) it should follow that the mean ranks
of the columns should each approximate the
mean rank of the table, or $\frac{1}{2}(p + 1)$, where
p equals the number of columns. The extent
of the deviations of the mean column ranks
from the mean rank of the table forms the
basis of the test which is designated by the
statistic $\chi_r^2$.

Friedman presents mathematical proof that
$\chi_r^2$ is distributed as is chi-square
when p is greater than 4 and the number
of rows is reasonably large, and that it provides
the appropriate test for the null hypothesis.
In order to determine whether or not the
obtained numerical value of this statistic is
significant the chi-square table is entered with
(p - 1) degrees of freedom. He presents exact
tables of probability for situations in which
p equals four or less.

The underlying theory of this statistic and
the method of calculation can best be elucidated
by working through a sample problem.
It is hoped that sufficient detail of the method
can be given to demonstrate the reasonableness
and the adaptability of the test. However,
it will be well for the reader to consult the
original articles by Friedman and Wallis for
a more thorough presentation of the statistical
theory.

A SAMPLE PROBLEM 


In a recent study concerned with the isolation
of factors associated with college attendance
of high school graduates³ it seemed important
to estimate the relationship between
socio-economic status and the degree of college
attendance. Since preliminary investigations
indicated that the socio-economic status of
the students was related to college aptitude, a
two-way table was arranged to show the per
cent of college attendance for the various
levels of socio-economic status and college
aptitude. In this way the effect of college
aptitude was controlled. The possible relationship
between college attendance and sex was
also controlled by making separate tables for
men and women students. The basic data for
this phase of the investigation are shown in
Table I.

³ Frank G. Schultz, An Investigation of Factors Associated
With the College Attendance of Wyoming High School Graduates,
Unpublished Ph.D. Thesis, 1941, University of Minnesota.


TABLE I
NUMBER AND PER CENT OF MALE HIGH SCHOOL GRADUATES ATTENDING COLLEGE FOR EACH
COLLEGE APTITUDE AND SOCIO-ECONOMIC STATUS INTERVAL

College                    Socio-Economic Status Intervalsᵇ
Aptitudeᵃ
Interval      0-14      15-18     19-22     23-26     27-30     31+       Total

…             4/11      2/3       3/5       6/10      5/8       26/26     46/63
              36.36     66.67     60.00     60.00     62.50     100.00    73.02

90-99         5/12      5/7       7/8       6/8       3/8       23/26     49/69
              41.67     71.43     87.50     75.00     37.50     88.46     71.01

80-89         5/16      8/17      9/17      8/16      4/11      18/24     52/101
              31.25     47.06     52.94     50.00     36.36     75.00     51.49

70-79         6/25      4/9       7/10      5/14      10/19     20/34     52/111
              24.00     44.44     70.00     35.71     52.63     58.82     46.86

60-69         10/38     12/26     7/16      11/20     7/15      24/37     71/152
              26.32     46.15     43.75     55.00     46.67     64.86     46.71

50-59         8/42      5/26      14/30     6/22      9/21      16/30     58/171
              19.05     19.23     46.67     27.27     42.86     53.33     33.92

40-49         13/37     5/27      7/31      1/22      3/14      12/21     41/152
              35.14     18.52     22.58     4.55      21.43     57.14     26.97

15-39         9/53      2/37      8/43      1/14      3/17      2/9       25/173
              16.98     5.41      18.60     7.14      17.65     22.22     14.45

Total         60/234    43/152    62/160    44/126    44/113    141/207   394/992
              25.64     28.29     38.75     34.92     38.94     68.12     39.72

Table reads as follows: Numerator of the fraction in the cell indicates the number of students attending
college. The denominator indicates the total number of students falling in that cell. The number
immediately below the fraction indicates the per cent of students attending college within four years
following graduation from high school.
ᵃ College aptitude in terms of scores made on the Ohio State University Psychological Examination,
Form 18.
ᵇ Socio-economic status measured in terms of an adapted form of the Sims score card.


The next step in the calculation of the $\chi_r^2$
statistic is to form Table II by ranking the
percentages in each row of Table I. While the
percentages may be ranked either in the
ascending or descending order, in this case the
descending order is used, giving the highest
percentage in each row a rank of 1 and the
lowest a rank of 6. Percentages which are
equal or which differ from each other by only
small amounts are given the average rank.
The total column in Table I does not enter
into the analysis and hence is disregarded.
The numbers at the foot of the table represent
steps in the calculation of $\chi_r^2$ and will
be explained later.


THEORY AND CALCULATION OF THE
$\chi_r^2$ STATISTIC


Wallis defines the $\chi_r^2$ statistic “as (p - 1)
times the ratio of the actual variance among
column means to the variance expected on the
basis of the null hypothesis”,⁴ the symbol “p”
representing the number of columns in the
rank matrix. Since it has been shown by
Friedman⁵ that the mean rank of a table of
ranks is $\frac{1}{2}(p + 1)$, and the variance
$(p^2 - 1)/12n$, in which “n” equals the number
of rows in the table, the above mentioned
verbal definition can be translated into:

$$\chi_r^2 = (p - 1)\, \frac{\frac{1}{p}\, \Sigma \left[\bar{r}_j - \frac{1}{2}(p + 1)\right]^2}{(p^2 - 1)/12n}$$

(in which $\bar{r}_j$ equals the mean rank of the jth
column.)

By simple algebraic manipulation this
equation becomes:

$$\chi_r^2 = \frac{12n}{p(p + 1)}\, \Sigma \left[\bar{r}_j - \tfrac{1}{2}(p + 1)\right]^2$$


TABLE II
ANALYSIS OF THE RELATION BETWEEN SOCIO-ECONOMIC STATUS AND COLLEGE ATTENDANCE
FOR MALE HIGH SCHOOL GRADUATES WHEN COLLEGE APTITUDE IS CONTROLLED

Ranks Based on Per Cent of College Attendance by S. E. S. Intervals

Aptitude
Intervals             0-14       15-18      19-22      23-26      27-30      31+
…                     6          2          4.5        4.5        3          1
90-99                 5          4          2          3          6          1
80-89                 6          4          2          3          5          1
70-79                 6          4          1          5          3          2
60-69                 6          4          5          2          3          1
50-59                 5.5        5.5        2          4          3          1
40-49                 2          5          3          6          4          1
15-39                 4          6          2          5          3          1
Sum of Ranks          40.5       34.5       21.5       32.5       30.0       9.0
Mean Rank             5.063      4.313      2.688      4.063      3.750      1.125
Deviation from
Theoretical Mean      +1.563     +.813      -.812      +.563      +.250      -2.375
Deviations Squared    2.442969   .660969    .659344    .316969    .062500    5.640625

Theoretical mean equals 3.5. Sum of deviations squared equals 9.783376.


For the data and calculations in Table II
the numerical value of the statistic becomes:

$$\chi_r^2 = \frac{12(8)}{6(7)}\,(9.783376), \text{ which equals } 22.365$$

Entering the chi-square table with (p - 1)
degrees of freedom we find a probability value
of less than .01. Thus we are justified in concluding
that the statistic is highly significant
and that there is a real association between
socio-economic status and college attendance.

⁴ W. Allen Wallis, op. cit., p. 533.
⁵ Milton Friedman, op. cit., p. 678.

Had the results of the $\chi_r^2$ test given us
reason to doubt the existence of a significant
relationship between the two factors under
consideration, there would have been no valid
reason for continuing the investigation further.
Under the circumstances, however, it will be
appropriate to measure the intensity of the
relationship between socio-economic status
and per cent of college attendance.


MEASURE OF INTENSITY OF RELATIONSHIP


A glance at the derived figures at the foot
of Table II will give an indication of the
nature of the relationship between socio-economic
status and the degree of college
attendance of high school graduates. It will
be seen that the mean column ranks decrease
rather uniformly when reading from left to
right, with only the mean of the 19-22 column
being out of place. In order to arrive at a
qualitative measure of the relationship, however,
we shall calculate the rank correlation
ratio previously mentioned.

Wallis⁶ defines this statistic ($\mathrm{Eta}_r$) as
“equal to the ratio of the sum of squares between
columns to the total sum of squares”,
or as “the proportion of the total variance
which is attributable to the variance of the
column means.” Since these sums of squares
enter into the calculation of $\chi_r^2$ it will readily
be seen that the rank correlation ratio is an
extension of the analysis of variance of ranks
previously illustrated.

Expressing the latter of the two definitions
in the form of an equation we get:

$$\mathrm{Eta}_r^2 = \frac{\text{Variance between column means}}{\text{Variance of the total rank matrix}}$$

Now it can be shown that the variance between
column means is obtained by finding
the sum of the squares of the differences between
the individual column means and the
theoretical mean and dividing the summation
by the number of columns ranked (p in our
notation). The variance between columns is
thus equal to $\Sigma \left[\bar{r}_j - \frac{1}{2}(p + 1)\right]^2 / p$. However,
since the numerator of the preceding
quantity also enters into the calculation of
$\chi_r^2$ it will save time to express the sum of
squares between means by the equivalent
$\chi_r^2\, p(p + 1)/12n$, so that the variance between
means becomes $\chi_r^2 (p + 1)/12n$.

The variance of the total rank matrix can
be obtained by summing the squares of the
difference between each individual rank in the
table and the theoretical mean (mean of the
total table) of the matrix and dividing this
quantity by the total number of individual
ranks, i.e., np. Friedman⁷ presents a method
of finding the total variance which involves
only the simple formula, $(p^2 - 1)/12$.

⁶ W. Allen Wallis, op. cit., p. 535.

Thus it follows that the rank correlation
ratio may be obtained through the application
of the relatively simple formula:

$$\mathrm{Eta}_r^2 = \frac{\chi_r^2 (p + 1)/12n}{(p^2 - 1)/12} = \frac{\chi_r^2}{n(p - 1)}$$

Applying the rank correlation ratio formula
to the sample problem as presented in Table
II, we find that

$$\mathrm{Eta}_r^2 = \frac{22.365}{8(5)} = .559$$

and

$$\mathrm{Eta}_r = .748 \text{ or } .75.$$
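The whole calculation, from the percentages of Table I to the rank correlation ratio, can be retraced with a short script. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the tie tolerance of 0.25 percentage points is a guess chosen because the text says only that percentages differing "by only small amounts" share the average rank. The upper-tail probability uses the closed form for a chi-square variable with 5 degrees of freedom, so `chi2_upper_tail_df5` applies to this six-column table only. Working from unrounded mean ranks gives a statistic of about 22.36, slightly below the 22.365 obtained above from rounded figures.

```python
import math

# Per cent attending college: rows are aptitude intervals (highest first),
# columns are the six socio-economic status intervals of Table I.
PERCENT = [
    [36.36, 66.67, 60.00, 60.00, 62.50, 100.00],
    [41.67, 71.43, 87.50, 75.00, 37.50, 88.46],
    [31.25, 47.06, 52.94, 50.00, 36.36, 75.00],
    [24.00, 44.44, 70.00, 35.71, 52.63, 58.82],
    [26.32, 46.15, 43.75, 55.00, 46.67, 64.86],
    [19.05, 19.23, 46.67, 27.27, 42.86, 53.33],
    [35.14, 18.52, 22.58, 4.55, 21.43, 57.14],
    [16.98, 5.41, 18.60, 7.14, 17.65, 22.22],
]


def rank_descending(row, tol=0.25):
    """Rank one row, highest value = 1; neighbours within tol share a mean rank."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        k = i
        while k + 1 < len(order) and row[order[k]] - row[order[k + 1]] <= tol:
            k += 1
        shared = (i + 1 + k + 1) / 2            # mean of ranks i+1 .. k+1
        for m in range(i, k + 1):
            ranks[order[m]] = shared
        i = k + 1
    return ranks


def method_of_ranks(table, tol=0.25):
    """Return column rank sums, the chi-square-r statistic, and Eta-r."""
    n, p = len(table), len(table[0])
    ranks = [rank_descending(row, tol) for row in table]
    sums = [sum(r[j] for r in ranks) for j in range(p)]
    ss = sum((s / n - (p + 1) / 2) ** 2 for s in sums)
    chi_r2 = 12 * n / (p * (p + 1)) * ss
    eta_r = math.sqrt(chi_r2 / (n * (p - 1)))
    return sums, chi_r2, eta_r


def chi2_upper_tail_df5(x):
    """Upper-tail probability of chi-square with exactly 5 degrees of freedom."""
    root = math.sqrt(x)
    return (math.erfc(root / math.sqrt(2))
            + math.sqrt(2 / math.pi) * math.exp(-x / 2) * (root + root ** 3 / 3))


rank_sums, chi_r2, eta_r = method_of_ranks(PERCENT)
p_value = chi2_upper_tail_df5(chi_r2)
print(rank_sums, round(chi_r2, 2), round(eta_r, 2), p_value < 0.01)
```

With the assumed tolerance, the only ties are the two 60.00 entries in the first row and the 19.05 and 19.23 entries in the 50-59 row, which reproduces the rank sums at the foot of Table II.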


Wallis points out that since the rank correlation
ratio is calculated only after one of
the variables is controlled (in this case college
aptitude), $\mathrm{Eta}_r$ is in reality a “partial” correlation
ratio. The efficacy of the rank correlation
ratio method is indicated in part by
the fact that a point bi-serial correlation between
socio-economic status and college
attendance, by which method the college aptitude
variable is not controlled, resulted in a
coefficient of only .30.


SUMMARY 


The purpose of this article is to show, by 
way of application to a practical problem, how 
the analysis of variance of ranks and the rank 
correlation ratio, developed by Friedman and 
Wallis respectively, may be adapted to educa- 
tional research. 

⁷ Milton Friedman, op. cit., p. 678.

It is pointed out that the “method of ranks”
can be used under conditions in which the
regular analysis of variance would not be
appropriate, e.g., when the basic data are not
normally distributed or when the data are
qualitative rather than quantitative. Another
advantage of the “method of ranks” lies in the
comparative ease with which the calculations
can be made. While it is somewhat less efficient
than the regular analysis of variance it
requires only a fraction of the amount of time
to run the test.

The $\chi_r^2$ statistic is distributed as is chi-square
when the number of columns in the
rank matrix is reasonably large. For tables of
less than five columns Friedman presents
exact tables which must be used to determine
the probability of a given $\chi_r^2$ value
occurring.

The rank correlation ratio ($\mathrm{Eta}_r$) suggested
by Wallis is the appropriate measure of relationship
between the factors under study when
$\chi_r^2$ is known to be significant. In view of the
fact that $\mathrm{Eta}_r$ provides for the control of at
least one variable while another is under
investigation, this statistic is in reality a
“partial” correlation ratio.

Both techniques should provide handy tools 
for the worker in educational research. They 
can be relatively easily understood and their 
use will frequently provide further insight into 
the nature of the conventional analysis of 
variance. 

The application of these tools to the practical
problem shows that there is a significant
relationship between socio-economic status
and per cent of college attendance when the
college aptitude factor is controlled. $\mathrm{Eta}_r$ in
this problem is shown to be .75, a correlation
which is much higher than that obtained when
the college aptitude factor is not controlled.